frgfm / holocron

PyTorch implementations of recent Computer Vision tricks (ReXNet, RepVGG, Unet3p, YOLOv4, CIoU loss, AdaBelief, PolyLoss, MobileOne)

Home Page: https://frgfm.github.io/Holocron/

License: Apache License 2.0

Python 99.74% Makefile 0.19% Dockerfile 0.07%
computer-vision cspdarknet53 darknet deep-learning object-detection pytorch resnet rexnet tridentnet unet-image-segmentation yolo yolov4

holocron's Introduction

Hello there

Hi! I'm F-G (short for François-Guillaume 🇫🇷)

  • 👨‍🦱 I'm a Deep Learning Engineer (Computer Vision & NLP) by day, an Open Source contributor by night 🦇
  • 💼 Currently brewing a company of my own @quack-ai (YC S23), I co-founded the NGO @PyroNear & volunteer @dataforgoodfr
  • ❤️ I'm passionate about open source, machine perception, astrophysics & environment protection
  • 🌱 Currently learning about Front-end
  • 😄 What I actually do in my spare time 🎹 📷 | What I wish I could do more often ⛷️

Tech stack 💻
  • Programming languages: Python JavaScript C++ Bash CUDA SQL
  • Libraries & frameworks: PyTorch TensorFlow NumPy OpenCV Pandas Streamlit Gradio FastAPI Next.js
  • What I mostly work on: Ubuntu Windows Raspberry Pi
  • How I write fancy equations: LaTeX
  • Where I share my work: GitHub Hugging Face Spaces PyPi Anaconda VS Marketplace
  • How I serve my applications: Poetry Yarn Docker AWS OVH Vercel
  • Coming soon: Node JS Unity JAX Go Swift
holocron's People

Contributors

dependabot[bot], frgfm, imgbot[bot], mateolostanlen

holocron's Issues

Inplace bias correction

exp_avg.div_(bias_correction1)

Here the bias correction is applied in place, so it modifies exp_avg for the following steps (the correction accumulates). Was that intentional?
The original Adam doesn't accumulate the bias correction: it is applied at each step only to compute the bias-corrected ("hat") moments, while the stored moving averages stay uncorrected.
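To illustrate the concern, here is a minimal sketch in plain Python (scalars instead of tensors, first moment only). The function name and the `inplace_correction` flag are hypothetical and only mimic the effect of `exp_avg.div_(bias_correction1)`; this is not Holocron's optimizer code.

```python
def adam_first_moment(grads, beta1=0.9, inplace_correction=False):
    """Return the bias-corrected first moment computed at each step."""
    exp_avg = 0.0
    corrected = []
    for step, grad in enumerate(grads, start=1):
        # Exponential moving average of the gradient
        exp_avg = beta1 * exp_avg + (1 - beta1) * grad
        bias_correction1 = 1 - beta1 ** step
        m_hat = exp_avg / bias_correction1
        if inplace_correction:
            # Mimics an in-place division: the stored state is overwritten,
            # so the correction leaks into every following step
            exp_avg = m_hat
        corrected.append(m_hat)
    return corrected


reference = adam_first_moment([1.0, 1.0, 1.0])
accumulated = adam_first_moment([1.0, 1.0, 1.0], inplace_correction=True)
print(reference)
print(accumulated)
```

With a constant gradient, the reference variant keeps the corrected moment at the gradient value, while the in-place variant drifts away from it from the second step onward.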

PolyLoss doesn't support soft targets

Bug description

Hello, @frgfm
When calling the PolyLoss() function, I got the following runtime error, but didn't know how to solve it.
Please help~

Code snippet to reproduce the bug

directly invoke PolyLoss() in my framework

Error traceback

File "/path/to/poly_loss.py", in forward
    return poly_loss(x, target, self.eps, self.weight, self.ignore_index, self.reduction)
File "/path/to/poly_loss.py", in poly_loss
    logpt = logpt.transpose(1, 0).flatten(1).gather(0, target.view(1, -1)).squeeze()
RuntimeError: gather(): Expected dtype int64 for index

Environment

python 3.10
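The traceback shows that the target is used as a gather index, which has to be an integer class label. If the targets are soft (per-class probabilities), one possible formulation of the Poly-1 loss is sketched below in plain Python. The helper names and the `eps` value are illustrative assumptions, and this is not Holocron's implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poly1_soft(logits, target_probs, eps=2.0):
    """Poly-1 loss for one sample with a soft (probability) target."""
    probs = softmax(logits)
    # Cross-entropy against the soft target distribution
    ce = -sum(t * math.log(p) for t, p in zip(target_probs, probs))
    # Expected probability assigned to the target distribution
    pt = sum(t * p for t, p in zip(target_probs, probs))
    return ce + eps * (1 - pt)


print(poly1_soft([0.0, 0.0], [1.0, 0.0]))
print(poly1_soft([2.0, 0.5, -1.0], [0.7, 0.2, 0.1]))
```

With a one-hot target this reduces to the usual hard-label form, so hard labels stored as floats can alternatively be cast to integers before calling the loss.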

polyloss not working when target contains `ignore_index`

Bug description

polyloss not working when target contains ignore_index

Code snippet to reproduce the bug

import torch
# poly_loss is Holocron's functional PolyLoss (its import is not shown in the report)

labels = [-100, 2, 1]
logits = [[-2.0605, -1.0522, 1.0922], [1.0303, -2.0048, 1.0727], [-1.1031, 1.0414, -2.0464]]
labels_pt = torch.tensor(labels, dtype=torch.int64)
logits_pt = torch.tensor(logits, dtype=torch.float32)

loss = poly_loss(logits_pt, labels_pt, reduction="none")

print("loss", loss)

Error traceback

index -100 is out of bounds for dimension 0 with size 3
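The error suggests that the ignored label is used directly as an index. A minimal plain-Python sketch of the expected behaviour is shown below: entries equal to `ignore_index` are skipped and contribute a zero loss. The function name and the `eps` default are assumptions for illustration, not Holocron's code.

```python
import math


def poly1_hard(logits_batch, labels, eps=2.0, ignore_index=-100):
    """Per-sample Poly-1 loss with integer labels, skipping ignored entries."""
    losses = []
    for logits, label in zip(logits_batch, labels):
        if label == ignore_index:
            # Never index the logits with an ignored label
            losses.append(0.0)
            continue
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        logpt = logits[label] - log_sum  # log-probability of the target class
        losses.append(-logpt + eps * (1 - math.exp(logpt)))
    return losses


labels = [-100, 2, 1]
logits = [
    [-2.0605, -1.0522, 1.0922],
    [1.0303, -2.0048, 1.0727],
    [-1.1031, 1.0414, -2.0464],
]
print(poly1_hard(logits, labels))
```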

Environment

[trainer] Add an image target size suggestion

Similarly to the LR finder, Holocron should integrate a suggestion for image target sizes. The goal is to limit oversampling / downsampling side effects while respecting aspect ratios.

Here is how this could be done:

  • a shape_recorder goes through all images to produce the lists of heights and widths
  • from these we build two arrays: aratio (height divided by width) and side (square root of height multiplied by width)
  • for each array, we look for a range of values that covers a high percentage of the distribution
  • we then pick the aspect ratio and take the resolved side. The target size will be (height, width) = (side * sqrt(aratio), side / sqrt(aratio))
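The steps above can be sketched as follows. This is only an illustration in plain Python: `suggest_target_size` is a hypothetical name, and the median is used as a simple stand-in for "a value that fits a high percentage of the distribution".

```python
import math
from statistics import median


def suggest_target_size(shapes):
    """Suggest a (height, width) target size from a list of (height, width) pairs."""
    aratios = [h / w for h, w in shapes]          # aspect ratios
    sides = [math.sqrt(h * w) for h, w in shapes]  # geometric-mean side lengths
    aratio = median(aratios)  # assumption: median as the representative value
    side = median(sides)
    height = side * math.sqrt(aratio)
    width = side / math.sqrt(aratio)
    return round(height), round(width)


shapes = [(100, 400), (200, 800), (300, 1200)]
print(suggest_target_size(shapes))
```

For a single image, side * sqrt(aratio) and side / sqrt(aratio) give back its original height and width, which is why this pair of statistics preserves the aspect ratio.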

Pretrained object detection

Is there a pretrained model available for object detection? I tried using YOLOv4, but I get an error when loading the pretrained weights because there is no URL in the script.

Error Resnet pretrained

🐛 Bug

Hi, I want to use a pretrained ResNet-50, but I get the error below and am having trouble resolving it.
It also occurs with ResNet-18 and ResNet-101.
I have not tested the other variants.

To Reproduce

Steps to reproduce the behavior:

import holocron

pretrained = True
backbone = "resnet50"
model = holocron.models.__dict__[backbone](pretrained, num_classes=1)

File "/.../.../Holocron/holocron/models/resnet.py", line 327, in resnet50
return _resnet('resnet50', pretrained, progress, **kwargs)
File "/.../.../Holocron/holocron/models/resnet.py", line 280, in _resnet
load_pretrained_params(model, default_cfgs[arch]['url'], progress)
File "/.../.../Holocron/holocron/models/utils.py", line 68, in load_pretrained_params
model.load_state_dict(state_dict)
File "/.../.../mainvenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ResNet:

Missing key(s) in state_dict: "features.0.weight", "features.1.weight", "features.1.bias", "features.1.running_mean", "features.1.running_var", "features.4.0.conv.0.weight", "features.4.0.conv.1.weight", "features.4.0.conv.1.bias", "features.4.0.conv.1.running_mean", "features.4.0.conv.1.running_var", "features.4.0.conv.3.weight", "features.4.0.conv.4.weight", "features.4.0.conv.4.bias", "features.4.0.conv.4.running_mean", "features.4.0.conv.4.running_var", "features.4.0.conv.6.weight", "features.4.0.conv.7.weight", "features.4.0.conv.7.bias", "features.4.0.conv.7.running_mean", "features.4.0.conv.7.running_var", "features.4.0.downsample.0.weight", "features.4.0.downsample.1.weight", "features.4.0.downsample.1.bias", "features.4.0.downsample.1.running_mean", "features.4.0.downsample.1.running_var", "features.4.1.conv.0.weight", "features.4.1.conv.1.weight", "features.4.1.conv.1.bias", "features.4.1.conv.1.running_mean", "features.4.1.conv.1.running_var", "features.4.1.conv.3.weight", "features.4.1.conv.4.weight", "features.4.1.conv.4.bias", "features.4.1.conv.4.running_mean", "features.4.1.conv.4.running_var", "features.4.1.conv.6.weight", "features.4.1.conv.7.weight", "features.4.1.conv.7.bias", "features.4.1.conv.7.running_mean", "features.4.1.conv.7.running_var", "features.4.2.conv.0.weight", "features.4.2.conv.1.weight", "features.4.2.conv.1.bias", "features.4.2.conv.1.running_mean", "features.4.2.conv.1.running_var", "features.4.2.conv.3.weight", "features.4.2.conv.4.weight", "features.4.2.conv.4.bias", "features.4.2.conv.4.running_mean", "features.4.2.conv.4.running_var", "features.4.2.conv.6.weight", "features.4.2.conv.7.weight", "features.4.2.conv.7.bias", "features.4.2.conv.7.running_mean", "features.4.2.conv.7.running_var", "features.5.0.conv.0.weight", "features.5.0.conv.1.weight", "features.5.0.conv.1.bias", "features.5.0.conv.1.running_mean", "features.5.0.conv.1.running_var", "features.5.0.conv.3.weight", "features.5.0.conv.4.weight", 
"features.5.0.conv.4.bias", "features.5.0.conv.4.running_mean", "features.5.0.conv.4.running_var", "features.5.0.conv.6.weight", "features.5.0.conv.7.weight", "features.5.0.conv.7.bias", "features.5.0.conv.7.running_mean", "features.5.0.conv.7.running_var", "features.5.0.downsample.0.weight", "features.5.0.downsample.1.weight", "features.5.0.downsample.1.bias", "features.5.0.downsample.1.running_mean", "features.5.0.downsample.1.running_var", "features.5.1.conv.0.weight", "features.5.1.conv.1.weight", "features.5.1.conv.1.bias", "features.5.1.conv.1.running_mean", "features.5.1.conv.1.running_var", "features.5.1.conv.3.weight", "features.5.1.conv.4.weight", "features.5.1.conv.4.bias", "features.5.1.conv.4.running_mean", "features.5.1.conv.4.running_var", "features.5.1.conv.6.weight", "features.5.1.conv.7.weight", "features.5.1.conv.7.bias", "features.5.1.conv.7.running_mean", "features.5.1.conv.7.running_var", "features.5.2.conv.0.weight", "features.5.2.conv.1.weight", "features.5.2.conv.1.bias", "features.5.2.conv.1.running_mean", "features.5.2.conv.1.running_var", "features.5.2.conv.3.weight", "features.5.2.conv.4.weight", "features.5.2.conv.4.bias", "features.5.2.conv.4.running_mean", "features.5.2.conv.4.running_var", "features.5.2.conv.6.weight", "features.5.2.conv.7.weight", "features.5.2.conv.7.bias", "features.5.2.conv.7.running_mean", "features.5.2.conv.7.running_var", "features.5.3.conv.0.weight", "features.5.3.conv.1.weight", "features.5.3.conv.1.bias", "features.5.3.conv.1.running_mean", "features.5.3.conv.1.running_var", "features.5.3.conv.3.weight", "features.5.3.conv.4.weight", "features.5.3.conv.4.bias", "features.5.3.conv.4.running_mean", "features.5.3.conv.4.running_var", "features.5.3.conv.6.weight", "features.5.3.conv.7.weight", "features.5.3.conv.7.bias", "features.5.3.conv.7.running_mean", "features.5.3.conv.7.running_var", "features.6.0.conv.0.weight", "features.6.0.conv.1.weight", "features.6.0.conv.1.bias", 
"features.6.0.conv.1.running_mean", "features.6.0.conv.1.running_var", "features.6.0.conv.3.weight", "features.6.0.conv.4.weight", "features.6.0.conv.4.bias", "features.6.0.conv.4.running_mean", "features.6.0.conv.4.running_var", "features.6.0.conv.6.weight", "features.6.0.conv.7.weight", "features.6.0.conv.7.bias", "features.6.0.conv.7.running_mean", "features.6.0.conv.7.running_var", "features.6.0.downsample.0.weight", "features.6.0.downsample.1.weight", "features.6.0.downsample.1.bias", "features.6.0.downsample.1.running_mean", "features.6.0.downsample.1.running_var", "features.6.1.conv.0.weight", "features.6.1.conv.1.weight", "features.6.1.conv.1.bias", "features.6.1.conv.1.running_mean", "features.6.1.conv.1.running_var", "features.6.1.conv.3.weight", "features.6.1.conv.4.weight", "features.6.1.conv.4.bias", "features.6.1.conv.4.running_mean", "features.6.1.conv.4.running_var", "features.6.1.conv.6.weight", "features.6.1.conv.7.weight", "features.6.1.conv.7.bias", "features.6.1.conv.7.running_mean", "features.6.1.conv.7.running_var", "features.6.2.conv.0.weight", "features.6.2.conv.1.weight", "features.6.2.conv.1.bias", "features.6.2.conv.1.running_mean", "features.6.2.conv.1.running_var", "features.6.2.conv.3.weight", "features.6.2.conv.4.weight", "features.6.2.conv.4.bias", "features.6.2.conv.4.running_mean", "features.6.2.conv.4.running_var", "features.6.2.conv.6.weight", "features.6.2.conv.7.weight", "features.6.2.conv.7.bias", "features.6.2.conv.7.running_mean", "features.6.2.conv.7.running_var", "features.6.3.conv.0.weight", "features.6.3.conv.1.weight", "features.6.3.conv.1.bias", "features.6.3.conv.1.running_mean", "features.6.3.conv.1.running_var", "features.6.3.conv.3.weight", "features.6.3.conv.4.weight", "features.6.3.conv.4.bias", "features.6.3.conv.4.running_mean", "features.6.3.conv.4.running_var", "features.6.3.conv.6.weight", "features.6.3.conv.7.weight", "features.6.3.conv.7.bias", "features.6.3.conv.7.running_mean", 
"features.6.3.conv.7.running_var", "features.6.4.conv.0.weight", "features.6.4.conv.1.weight", "features.6.4.conv.1.bias", "features.6.4.conv.1.running_mean", "features.6.4.conv.1.running_var", "features.6.4.conv.3.weight", "features.6.4.conv.4.weight", "features.6.4.conv.4.bias", "features.6.4.conv.4.running_mean", "features.6.4.conv.4.running_var", "features.6.4.conv.6.weight", "features.6.4.conv.7.weight", "features.6.4.conv.7.bias", "features.6.4.conv.7.running_mean", "features.6.4.conv.7.running_var", "features.6.5.conv.0.weight", "features.6.5.conv.1.weight", "features.6.5.conv.1.bias", "features.6.5.conv.1.running_mean", "features.6.5.conv.1.running_var", "features.6.5.conv.3.weight", "features.6.5.conv.4.weight", "features.6.5.conv.4.bias", "features.6.5.conv.4.running_mean", "features.6.5.conv.4.running_var", "features.6.5.conv.6.weight", "features.6.5.conv.7.weight", "features.6.5.conv.7.bias", "features.6.5.conv.7.running_mean", "features.6.5.conv.7.running_var", "features.7.0.conv.0.weight", "features.7.0.conv.1.weight", "features.7.0.conv.1.bias", "features.7.0.conv.1.running_mean", "features.7.0.conv.1.running_var", "features.7.0.conv.3.weight", "features.7.0.conv.4.weight", "features.7.0.conv.4.bias", "features.7.0.conv.4.running_mean", "features.7.0.conv.4.running_var", "features.7.0.conv.6.weight", "features.7.0.conv.7.weight", "features.7.0.conv.7.bias", "features.7.0.conv.7.running_mean", "features.7.0.conv.7.running_var", "features.7.0.downsample.0.weight", "features.7.0.downsample.1.weight", "features.7.0.downsample.1.bias", "features.7.0.downsample.1.running_mean", "features.7.0.downsample.1.running_var", "features.7.1.conv.0.weight", "features.7.1.conv.1.weight", "features.7.1.conv.1.bias", "features.7.1.conv.1.running_mean", "features.7.1.conv.1.running_var", "features.7.1.conv.3.weight", "features.7.1.conv.4.weight", "features.7.1.conv.4.bias", "features.7.1.conv.4.running_mean", "features.7.1.conv.4.running_var", 
"features.7.1.conv.6.weight", "features.7.1.conv.7.weight", "features.7.1.conv.7.bias", "features.7.1.conv.7.running_mean", "features.7.1.conv.7.running_var", "features.7.2.conv.0.weight", "features.7.2.conv.1.weight", "features.7.2.conv.1.bias", "features.7.2.conv.1.running_mean", "features.7.2.conv.1.running_var", "features.7.2.conv.3.weight", "features.7.2.conv.4.weight", "features.7.2.conv.4.bias", "features.7.2.conv.4.running_mean", "features.7.2.conv.4.running_var", "features.7.2.conv.6.weight", "features.7.2.conv.7.weight", "features.7.2.conv.7.bias", "features.7.2.conv.7.running_mean", "features.7.2.conv.7.running_var", "head.weight", "head.bias". 
        
Unexpected key(s) in state_dict: "conv1.weight", "bn1.running_mean", "bn1.running_var", "bn1.weight", "bn1.bias", "layer1.0.conv1.weight", "layer1.0.bn1.running_mean", "layer1.0.bn1.running_var", "layer1.0.bn1.weight", "layer1.0.bn1.bias", "layer1.0.conv2.weight", "layer1.0.bn2.running_mean", "layer1.0.bn2.running_var", "layer1.0.bn2.weight", "layer1.0.bn2.bias", "layer1.0.conv3.weight", "layer1.0.bn3.running_mean", "layer1.0.bn3.running_var", "layer1.0.bn3.weight", "layer1.0.bn3.bias", "layer1.0.downsample.0.weight", "layer1.0.downsample.1.running_mean", "layer1.0.downsample.1.running_var", "layer1.0.downsample.1.weight", "layer1.0.downsample.1.bias", "layer1.1.conv1.weight", "layer1.1.bn1.running_mean", "layer1.1.bn1.running_var", "layer1.1.bn1.weight", "layer1.1.bn1.bias", "layer1.1.conv2.weight", "layer1.1.bn2.running_mean", "layer1.1.bn2.running_var", "layer1.1.bn2.weight", "layer1.1.bn2.bias", "layer1.1.conv3.weight", "layer1.1.bn3.running_mean", "layer1.1.bn3.running_var", "layer1.1.bn3.weight", "layer1.1.bn3.bias", "layer1.2.conv1.weight", "layer1.2.bn1.running_mean", "layer1.2.bn1.running_var", "layer1.2.bn1.weight", "layer1.2.bn1.bias", "layer1.2.conv2.weight", "layer1.2.bn2.running_mean", "layer1.2.bn2.running_var", "layer1.2.bn2.weight", "layer1.2.bn2.bias", "layer1.2.conv3.weight", "layer1.2.bn3.running_mean", "layer1.2.bn3.running_var", "layer1.2.bn3.weight", "layer1.2.bn3.bias", "layer2.0.conv1.weight", "layer2.0.bn1.running_mean", "layer2.0.bn1.running_var", "layer2.0.bn1.weight", "layer2.0.bn1.bias", "layer2.0.conv2.weight", "layer2.0.bn2.running_mean", "layer2.0.bn2.running_var", "layer2.0.bn2.weight", "layer2.0.bn2.bias", "layer2.0.conv3.weight", "layer2.0.bn3.running_mean", "layer2.0.bn3.running_var", "layer2.0.bn3.weight", "layer2.0.bn3.bias", "layer2.0.downsample.0.weight", "layer2.0.downsample.1.running_mean", "layer2.0.downsample.1.running_var", "layer2.0.downsample.1.weight", "layer2.0.downsample.1.bias", "layer2.1.conv1.weight", 
"layer2.1.bn1.running_mean", "layer2.1.bn1.running_var", "layer2.1.bn1.weight", "layer2.1.bn1.bias", "layer2.1.conv2.weight", "layer2.1.bn2.running_mean", "layer2.1.bn2.running_var", "layer2.1.bn2.weight", "layer2.1.bn2.bias", "layer2.1.conv3.weight", "layer2.1.bn3.running_mean", "layer2.1.bn3.running_var", "layer2.1.bn3.weight", "layer2.1.bn3.bias", "layer2.2.conv1.weight", "layer2.2.bn1.running_mean", "layer2.2.bn1.running_var", "layer2.2.bn1.weight", "layer2.2.bn1.bias", "layer2.2.conv2.weight", "layer2.2.bn2.running_mean", "layer2.2.bn2.running_var", "layer2.2.bn2.weight", "layer2.2.bn2.bias", "layer2.2.conv3.weight", "layer2.2.bn3.running_mean", "layer2.2.bn3.running_var", "layer2.2.bn3.weight", "layer2.2.bn3.bias", "layer2.3.conv1.weight", "layer2.3.bn1.running_mean", "layer2.3.bn1.running_var", "layer2.3.bn1.weight", "layer2.3.bn1.bias", "layer2.3.conv2.weight", "layer2.3.bn2.running_mean", "layer2.3.bn2.running_var", "layer2.3.bn2.weight", "layer2.3.bn2.bias", "layer2.3.conv3.weight", "layer2.3.bn3.running_mean", "layer2.3.bn3.running_var", "layer2.3.bn3.weight", "layer2.3.bn3.bias", "layer3.0.conv1.weight", "layer3.0.bn1.running_mean", "layer3.0.bn1.running_var", "layer3.0.bn1.weight", "layer3.0.bn1.bias", "layer3.0.conv2.weight", "layer3.0.bn2.running_mean", "layer3.0.bn2.running_var", "layer3.0.bn2.weight", "layer3.0.bn2.bias", "layer3.0.conv3.weight", "layer3.0.bn3.running_mean", "layer3.0.bn3.running_var", "layer3.0.bn3.weight", "layer3.0.bn3.bias", "layer3.0.downsample.0.weight", "layer3.0.downsample.1.running_mean", "layer3.0.downsample.1.running_var", "layer3.0.downsample.1.weight", "layer3.0.downsample.1.bias", "layer3.1.conv1.weight", "layer3.1.bn1.running_mean", "layer3.1.bn1.running_var", "layer3.1.bn1.weight", "layer3.1.bn1.bias", "layer3.1.conv2.weight", "layer3.1.bn2.running_mean", "layer3.1.bn2.running_var", "layer3.1.bn2.weight", "layer3.1.bn2.bias", "layer3.1.conv3.weight", "layer3.1.bn3.running_mean", "layer3.1.bn3.running_var", 
"layer3.1.bn3.weight", "layer3.1.bn3.bias", "layer3.2.conv1.weight", "layer3.2.bn1.running_mean", "layer3.2.bn1.running_var", "layer3.2.bn1.weight", "layer3.2.bn1.bias", "layer3.2.conv2.weight", "layer3.2.bn2.running_mean", "layer3.2.bn2.running_var", "layer3.2.bn2.weight", "layer3.2.bn2.bias", "layer3.2.conv3.weight", "layer3.2.bn3.running_mean", "layer3.2.bn3.running_var", "layer3.2.bn3.weight", "layer3.2.bn3.bias", "layer3.3.conv1.weight", "layer3.3.bn1.running_mean", "layer3.3.bn1.running_var", "layer3.3.bn1.weight", "layer3.3.bn1.bias", "layer3.3.conv2.weight", "layer3.3.bn2.running_mean", "layer3.3.bn2.running_var", "layer3.3.bn2.weight", "layer3.3.bn2.bias", "layer3.3.conv3.weight", "layer3.3.bn3.running_mean", "layer3.3.bn3.running_var", "layer3.3.bn3.weight", "layer3.3.bn3.bias", "layer3.4.conv1.weight", "layer3.4.bn1.running_mean", "layer3.4.bn1.running_var", "layer3.4.bn1.weight", "layer3.4.bn1.bias", "layer3.4.conv2.weight", "layer3.4.bn2.running_mean", "layer3.4.bn2.running_var", "layer3.4.bn2.weight", "layer3.4.bn2.bias", "layer3.4.conv3.weight", "layer3.4.bn3.running_mean", "layer3.4.bn3.running_var", "layer3.4.bn3.weight", "layer3.4.bn3.bias", "layer3.5.conv1.weight", "layer3.5.bn1.running_mean", "layer3.5.bn1.running_var", "layer3.5.bn1.weight", "layer3.5.bn1.bias", "layer3.5.conv2.weight", "layer3.5.bn2.running_mean", "layer3.5.bn2.running_var", "layer3.5.bn2.weight", "layer3.5.bn2.bias", "layer3.5.conv3.weight", "layer3.5.bn3.running_mean", "layer3.5.bn3.running_var", "layer3.5.bn3.weight", "layer3.5.bn3.bias", "layer4.0.conv1.weight", "layer4.0.bn1.running_mean", "layer4.0.bn1.running_var", "layer4.0.bn1.weight", "layer4.0.bn1.bias", "layer4.0.conv2.weight", "layer4.0.bn2.running_mean", "layer4.0.bn2.running_var", "layer4.0.bn2.weight", "layer4.0.bn2.bias", "layer4.0.conv3.weight", "layer4.0.bn3.running_mean", "layer4.0.bn3.running_var", "layer4.0.bn3.weight", "layer4.0.bn3.bias", "layer4.0.downsample.0.weight", 
"layer4.0.downsample.1.running_mean", "layer4.0.downsample.1.running_var", "layer4.0.downsample.1.weight", "layer4.0.downsample.1.bias", "layer4.1.conv1.weight", "layer4.1.bn1.running_mean", "layer4.1.bn1.running_var", "layer4.1.bn1.weight", "layer4.1.bn1.bias", "layer4.1.conv2.weight", "layer4.1.bn2.running_mean", "layer4.1.bn2.running_var", "layer4.1.bn2.weight", "layer4.1.bn2.bias", "layer4.1.conv3.weight", "layer4.1.bn3.running_mean", "layer4.1.bn3.running_var", "layer4.1.bn3.weight", "layer4.1.bn3.bias", "layer4.2.conv1.weight", "layer4.2.bn1.running_mean", "layer4.2.bn1.running_var", "layer4.2.bn1.weight", "layer4.2.bn1.bias", "layer4.2.conv2.weight", "layer4.2.bn2.running_mean", "layer4.2.bn2.running_var", "layer4.2.bn2.weight", "layer4.2.bn2.bias", "layer4.2.conv3.weight", "layer4.2.bn3.running_mean", "layer4.2.bn3.running_var", "layer4.2.bn3.weight", "layer4.2.bn3.bias", "fc.weight", "fc.bias". 
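The two key lists use different naming schemes ("features.N..." expected by the model versus "conv1", "layerN", "fc" found in the checkpoint). A small helper like the one below can make such a mismatch easier to read by grouping keys by their top-level prefix. It is a plain-Python sketch with hypothetical function names; with PyTorch, the inputs would be the keys of `model.state_dict()` and of the loaded checkpoint.

```python
from collections import Counter


def summarize_prefixes(keys):
    """Count keys by their first dotted component."""
    return Counter(key.split(".", 1)[0] for key in keys)


def compare_keys(model_keys, checkpoint_keys):
    """Group missing and unexpected state_dict keys by top-level prefix."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        "missing": summarize_prefixes(model_keys - checkpoint_keys),
        "unexpected": summarize_prefixes(checkpoint_keys - model_keys),
    }


# A few keys taken from the traceback above
model_keys = ["features.0.weight", "features.1.weight", "head.weight", "head.bias"]
checkpoint_keys = ["conv1.weight", "bn1.weight", "layer1.0.conv1.weight", "fc.weight", "fc.bias"]
print(compare_keys(model_keys, checkpoint_keys))
```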

Expected behavior

Environment

  • PyTorch Version (e.g., 1.7): torch==1.8.1+cu111 torchvision==0.9.1+cu111

  • OS (e.g., Linux): Ubuntu 18.04

  • Python version: 3.7.9

  • CUDA/cuDNN version: 11

  • GPU models and configuration: RTX 2080TI

  • Any other relevant information:

LAMB: Differences from the paper author's official implementation

The PyTorch LAMB implementation you released differs from the official TensorFlow version released by the paper's authors. The authors' implementation skips some parameters, selected by name, during the computation, whereas your implementation seems to include all parameters directly.
For example: exclude_from_weight_decay=["batch_normalization", "LayerNorm", "layer_norm"]
Their implementation:
https://github.com/tensorflow/addons/blob/master/tensorflow_addons/optimizers/lamb.py
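As an illustration of the name-based exclusion described above, here is a minimal sketch in plain Python. The function names and the example parameter names are hypothetical, and regular-expression matching on the parameter name is assumed as the selection rule.

```python
import re

# Patterns quoted in the issue
EXCLUDE_FROM_WEIGHT_DECAY = ["batch_normalization", "LayerNorm", "layer_norm"]


def uses_weight_decay(param_name, exclude_patterns=EXCLUDE_FROM_WEIGHT_DECAY):
    """Return False when the parameter name matches any exclusion pattern."""
    return not any(re.search(pattern, param_name) for pattern in exclude_patterns)


def split_param_names(param_names, exclude_patterns=EXCLUDE_FROM_WEIGHT_DECAY):
    """Split names into (decayed, not decayed) groups."""
    decay = [n for n in param_names if uses_weight_decay(n, exclude_patterns)]
    no_decay = [n for n in param_names if not uses_weight_decay(n, exclude_patterns)]
    return decay, no_decay


# Hypothetical parameter names
names = ["encoder/dense/kernel", "encoder/LayerNorm/gamma", "stem/batch_normalization/beta"]
print(split_param_names(names))
```

In PyTorch, a comparable effect is usually obtained by passing two parameter groups to the optimizer, one of them with `weight_decay` set to 0.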

cIoU loss calculation question

Thank you for designing this great tool!
I found that the CIoU implementation differs slightly from the original paper (Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression).

The ciou_loss() function computes the final loss with:
ciou_loss[_filter].addcdiv_(v[_filter] ** 2, 1 - iou[_filter] + v[_filter])

The alpha in the paper is formulated as v / ((1 - IoU) + v),
whereas the code looks like it computes v^2 / ((1 - IoU) + v).
I'd like to ask whether this is a better design than the original CIoU loss.
Anyway, thank you for the implementation, it really saves me a lot of time!
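For reference, the two expressions can be related as follows. The paper's penalty term is the product of alpha and v, so if the quoted line adds that product to a tensor already holding the other terms (an assumption, since the surrounding code is not shown here), it matches the paper's formulation:

```latex
\mathcal{L}_{CIoU} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v,
\qquad
\alpha = \frac{v}{(1 - IoU) + v}
\quad\Longrightarrow\quad
\alpha v = \frac{v^2}{(1 - IoU) + v}
```

Here `addcdiv_(t1, t2)` adds the element-wise quotient t1 / t2 in place, with t1 = v^2 and t2 = (1 - IoU) + v.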

Collaboration

Hello there,

I am the author of glasses. What if we join forces?

May the force be with you!

Francesco

Release tracker - v0.2.1

This issue is to be used to track the roadmap of Holocron for release v0.2.1, and collect feedback from users & contributors.

[trainer] Add a loss recorder option

Once a model has been trained, it would be nice to analyze its main failures. For that, one could:

  • go through the training set computing the loss (unreduced on the batch dimension)
  • keep the N samples with the highest loss
  • plot the image, the loss, the prediction and the target

A later option would be to display the CAM to better understand the model behaviour.
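The first two steps above could look like the sketch below, written in plain Python for illustration. `worst_samples` and `squared_error` are hypothetical names, and the per-sample loss function stands in for an unreduced criterion.

```python
import heapq


def worst_samples(samples, loss_fn, n=3):
    """Return the n (loss, index) pairs with the highest per-sample loss."""
    scored = (
        (loss_fn(pred, target), idx)
        for idx, (pred, target) in enumerate(samples)
    )
    # nlargest keeps only n candidates in memory while scanning the dataset
    return heapq.nlargest(n, scored)


def squared_error(pred, target):
    return (pred - target) ** 2


# (prediction, target) pairs standing in for model outputs on the training set
samples = [(0.9, 1.0), (0.1, 1.0), (0.5, 1.0), (1.0, 1.0)]
for loss, idx in worst_samples(samples, squared_error, n=2):
    print(idx, loss)
```

The returned indices can then be used to fetch the corresponding images, predictions and targets for plotting.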
