
orepa_cvpr2022's People

Contributors

jerexjs, jugghm, sixkplus


orepa_cvpr2022's Issues

about accuracy of ResNet34

Hello, I'm very confused about the accuracy of ResNet34. Specifically, I have trained ResNet34 many times, but its accuracy is about 74.40, while both this paper and RepVGG report an accuracy of about 74.13. I have confirmed that my setup is identical to RepVGG's, including the number of devices. Could you help me please?

I think there is a minor error

Hi, thanks for making your code public. It is really great work!
I ran your code, and I think there is a minor error in it.

On line 217 of train.py:
lr_scheduler = WarmupCosineAnnealingLR(optimizer=optimizer, T_cosine_max=args.epochs * IMAGENET_TRAINSET_SIZE // args.batch_size // ngpus_per_node, warmup=args.epochs/24)

I think this warms up the learning rate for only 5 steps, not 5 epochs. To warm up the learning rate over 5 epochs, 'args.epochs/24' should be 'args.epochs*len(train_loader)/24'.

Therefore, I modified line 217 as follows:
lr_scheduler = WarmupCosineAnnealingLR(optimizer=optimizer, T_cosine_max=args.epochs * len(train_loader), warmup=args.epochs * len(train_loader) / 24)
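For scale, here is a quick back-of-the-envelope check of the discrepancy, assuming hypothetical hyperparameters (120 epochs, ImageNet's 1,281,167 training images, batch size 256 — the repo's actual values may differ):

```python
# Assumed hyperparameters for illustration only; the repo's actual values may differ.
epochs = 120
imagenet_trainset_size = 1281167
batch_size = 256

steps_per_epoch = imagenet_trainset_size // batch_size   # 5004 scheduler steps per epoch
warmup_as_written = epochs / 24                          # 5.0  -> only 5 scheduler steps
warmup_intended = epochs * steps_per_epoch / 24          # 25020.0 -> 5 full epochs of steps
print(steps_per_epoch, warmup_as_written, warmup_intended)
```

Since the scheduler is stepped once per batch, a warmup of 5 is over within the first few iterations of epoch 0 rather than spanning 5 epochs.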

Thank you!

About scaling

Hi, the paper proposes replacing BN with a scaling layer, so why does the released code still use BN?

OREPA_LargeConvBase result is wrong

Hi,
While reproducing OREPA_LargeConvBase, I got results different from what I expected. Could you tell me the reason for this?

import torch
from torch import nn as nn
import torch.nn.functional as F
import torch.nn.init as init
import math

weight = nn.Parameter(torch.Tensor(128, 64, 3, 3))
weight1 = nn.Parameter(torch.Tensor(128, 128, 3, 3))
weight2 = nn.Parameter(torch.Tensor(64, 128, 3, 3))

init.kaiming_uniform_(weight, a=math.sqrt(5))
init.kaiming_uniform_(weight1, a=math.sqrt(5))
init.kaiming_uniform_(weight2, a=math.sqrt(5))

# Merge the three 3x3 kernels into a single end-to-end 7x7 kernel
rep_weight = weight.transpose(0, 1)
rep_weight = F.conv2d(rep_weight, weight1, groups=1, padding=2)
rep_weight = F.conv2d(rep_weight, weight2, groups=1, padding=2)
rep_weight = rep_weight.transpose(0, 1)

data = torch.randn((1, 64, 1080, 1920)) * 255
# Apply the three convs sequentially vs. the single merged 7x7 conv
conv_result = F.conv2d(F.conv2d(F.conv2d(data, weight=weight, padding=1), weight=weight1, padding=1), weight=weight2, padding=1)
rep_result = F.conv2d(input=data, weight=rep_weight, bias=None, stride=1, padding=3)

diff = torch.abs(rep_result - conv_result)
print(f"max diff: {diff.max()}")
print(f"median diff: {diff.median()}")
print(f"mean diff: {diff.mean()}")


# max diff: 365.46533203125
# median diff: 44.0756950378418
# mean diff: 52.17301559448242

about linear deep stem

Hi,
Thank you for your great work.
Where is the implementation code for the "linear deep stem" method?
Is it in "OREPA_LargeConvBase"?

Numerical Stability

Hi, I'm wondering if you've run into any issues with numerical stability or know what may be the cause.

With normal RepVGG, I get differences as high as 4e-4 when comparing outputs before and after switching to deploy mode. After changing the first conv to OREPA_LargeConv, the errors go as high as 2e-3. After also changing the 1x1 conv in the RepVGG block to OREPA_1x1, the differences reach 0.1.

It seems numerical stability makes it challenging to use identity + OREPA_1x1 + OREPA_3x3 blocks in a RepVGG-style model. Any thoughts on why?

About weight similarity across branches.

Hi, thanks for your great work!

I tried to reproduce the visualization of branch-level similarity of OREPA blocks, but unexpected results emerged.

Could you share details about it?

Could you explain Proposition 1? I don't quite understand it. Thanks.

Proposition 1: A single-branch linear mapping, when re-parameterizing parts or all of it by over-two-layer multi-branch topologies, the entire end-to-end weight matrix will be differently optimized. If one layer of the mapping is re-parameterized to up-to-one-layer multi-branch topologies, the optimization will remain unchanged.
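As a toy illustration of both halves of the proposition (my own scalar sketch, not from the paper): a one-layer multi-branch sum w = wa + wb takes a gradient step in the same direction as the single weight w, only with a rescaled effective learning rate, while a two-layer factorization w = w2*w1 takes a step whose size depends on the factors themselves, so the end-to-end trajectory genuinely differs.

```python
# Toy scalar example: one GD step on L(w) = (w - t)**2 / 2 under three
# parameterizations that share the same end-to-end initialization w = 0.5.
eta, t = 0.1, 1.0
g = lambda w: (w - t)  # gradient of L with respect to the end-to-end weight

# Single-branch baseline.
w = 0.5
w_single = w - eta * g(w)

# One-layer multi-branch (w = wa + wb): each branch receives the same gradient
# g(w), so the end-to-end weight moves by 2*eta*g(w) -- same direction as the
# single branch, just a rescaled learning rate.
wa, wb = 0.2, 0.3
w_branch = (wa - eta * g(wa + wb)) + (wb - eta * g(wa + wb))

# Two-layer factorization (w = w2 * w1): each factor's gradient is scaled by
# the other factor, so the end-to-end update depends on w1 and w2 themselves.
w1, w2 = 1.0, 0.5  # w2 * w1 = 0.5, same end-to-end init
w1n = w1 - eta * w2 * g(w2 * w1)
w2n = w2 - eta * w1 * g(w2 * w1)
w_deep = w2n * w1n

print(w_single, w_branch, w_deep)
```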

About the OREPA+RepVGG experimental results

Hi,
In the paper, when OREPA is combined with RepVGG, OREPA is applied directly on the conv_3x3 branch rather than replacing all three branches (conv_3x3 / conv_1x1 / identity) with their OREPA forms. Is this because replacing all three branches with OREPA leads to worse training results?

About orepa block extending to 3D

Hi, I really like the idea of this paper. I'm now trying to extend it to 3D, but I don't quite understand the role of self.fre_init in the OREPA module. Could you explain it again? Also, can prior_tensor be made 3D?
