
invertible-resnet's People

Contributors

dependabot[bot], jhjacobsen


invertible-resnet's Issues

Undefined names: Missing imports?

flake8 testing of https://github.com/jhjacobsen/invertible-resnet on Python 3.7.1

$ flake8 . --count --select=E9,F63,F72,F82 --show-source --statistics

./CIFAR_main.py:311:102: F821 undefined name 'full_fname'
        interpolate(model, testloader, testset, start_epoch, use_cuda, best_objective, args.dataset, full_fname)
                                                                                                     ^
./models/utils_toy_densities.py:250:19: F821 undefined name 'model'
    clip_fc_layer(model, coeff, use_cuda)
                  ^
./models/utils_toy_densities.py:250:26: F821 undefined name 'coeff'
    clip_fc_layer(model, coeff, use_cuda)
                         ^
./models/utils_toy_densities.py:250:33: F821 undefined name 'use_cuda'
    clip_fc_layer(model, coeff, use_cuda)
                                ^
./models/utils_toy_densities.py:257:41: F821 undefined name 'model'
    out_bij, p_z_g_y, trace, gt_trace = model(inputs)
                                        ^
./models/utils_toy_densities.py:258:31: F821 undefined name 'model'
    log_det = compute_log_det(model, inputs, out_bij)
                              ^
./models/utils_toy_densities.py:277:41: F821 undefined name 'model'
    out_bij, p_z_g_y, trace, gt_trace = model(inputs)
                                        ^
./models/utils_toy_densities.py:278:31: F821 undefined name 'model'
    log_det = compute_log_det(model, inputs, out_bij)
                              ^
./models/model_utils.py:226:76: F821 undefined name 'num_units'
                         'multiple of group_size({})'.format(num_channels, num_units))
                                                                           ^
./models/invertible_layers.py:181:26: F821 undefined name 'Conv2dZeroInit'
        self.conv_zero = Conv2dZeroInit(c // 2, c, 3, padding=(3 - 1) // 2)
                         ^
./models/invertible_layers.py:187:16: F821 undefined name 'gaussian_diag'
        return gaussian_diag(mean, logs)
               ^
./models/invertible_layers.py:215:21: F821 undefined name 'NN_actnorm'
          self.NN = NN_actnorm(H, W, in_channels=num_features // 2, hidden_channels=width)
                    ^
./models/invertible_layers.py:217:21: F821 undefined name 'NN_layernorm'
          self.NN = NN_layernorm(H, W, in_channels=num_features // 2, hidden_channels=width)
                    ^
./models/invertible_layers.py:219:21: F821 undefined name 'NN_batchnorm'
          self.NN = NN_batchnorm(H, W, in_channels=num_features // 2, hidden_channels=width)
                    ^
./models/invertible_layers.py:237:21: F821 undefined name 'NN_actnorm'
          self.NN = NN_actnorm(H, W, in_channels=num_features // 2, hidden_channels=width, channels_out=num_features)
                    ^
./models/invertible_layers.py:239:21: F821 undefined name 'NN_layernorm'
          self.NN = NN_layernorm(H, W, in_channels=num_features // 2, hidden_channels=width, channels_out=num_features)
                    ^
./models/invertible_layers.py:241:21: F821 undefined name 'NN_batchnorm'
          self.NN = NN_batchnorm(H, W, in_channels=num_features // 2, hidden_channels=width, channels_out=num_features)
                    ^
./models/invertible_layers.py:250:22: F821 undefined name 'flatten_sum'
        objective += flatten_sum(torch.log(scale))
                     ^
./models/invertible_layers.py:261:22: F821 undefined name 'flatten_sum'
        objective -= flatten_sum(torch.log(scale))
                     ^
19    F821 undefined name 'full_fname'
19

E901,E999,F821,F822,F823 are the "showstopper" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. These five are different from most other flake8 issues, which are merely "style violations" -- useful for readability but they do not affect runtime safety.

  • F821: undefined name name
  • F822: undefined name name in __all__
  • F823: local variable name referenced before assignment
  • E901: SyntaxError or IndentationError
  • E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree

There is no argument called svdclipping

I have tried to run your program using the command script, and I ran into the following issue.

  1. You do not provide the argument svdclipping in your code (CIFAR_main.py, line 241).

However, you try to build your model based on this argument in several places. After carefully checking your code, I found that you just pass this argument around several times, but it is never actually used in your model.

About reproducing the dynamics of ResNet and i-ResNet in Figure 1

Hi,

I'm trying to reproduce the phenomenon in Figure 1, but I am a bit confused. As you demonstrate, the networks in Fig. 1 map the interval [-2,2] to a noisy x^3. Since [-2,2] -> x^3 is only one-dimensional, while the ResNets require 3-dimensional input, I wonder how they map that interval. If the dynamics are the mappings of the residual blocks' outputs, those outputs have different sizes due to downsampling. In brief, my question is how you performed that mapping operation.

Many thanks,
Z. L

Error of Inverse Result is Large?

I use your command script to run a classification model and meet these 2 issues.

  1. When the model has not been trained, I test its inverse function, and the error for a (3x32x32 sized) picture is only about 0.001 when running 20 inverse iterations.

    Then I load the model after 1 epoch, and the reconstruction error is suddenly about 5.

    I also load the model after 50, 150, and 200 epochs, but none of them match the untrained model's inverse error. After 200 epochs, for a (3x32x32 sized) picture, the smallest error is about 0.95.

  2. When I use inverse iterations on the trained model, the reconstruction error rises as I use more inverse iterations. This is strange, because I would expect that the more inverse iterations I use, the smaller the inverse error should be.

Is this result normal? This problem puzzles me a lot.
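For reference, a minimal sketch of the fixed-point iteration used to invert a residual block y = x + g(x) (my own illustration, assuming Lip(g) < 1; residual_branch is a placeholder, not a function from the repository):

import torch

def invert_residual_block(residual_branch, y, num_iters=20):
    # Fixed-point iteration x_{k+1} = y - g(x_k); converges when Lip(g) < 1.
    x = y.clone()
    for _ in range(num_iters):
        x = y - residual_branch(x)
    return x

# Toy usage: a contractive linear map g with Lip(g) = 0.5.
g = lambda x: 0.5 * x
y = torch.randn(4)
x = invert_residual_block(g, y)
print(torch.norm((x + g(x)) - y))  # reconstruction error shrinks geometrically with num_iters

The iteration converges at a geometric rate governed by Lip(g), so if the layers' spectral norms drift above the target coefficient during training, convergence slows or can fail, which may relate to the behaviour described above.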

Classifier OOM when computing on test set.

Thanks for a great repository; the code works very well, is nicely documented, and the overall structure is intuitive.
I found a minor issue which can easily be solved.

The function test(..) computes the loss on the test set without turning gradient computation off.

https://github.com/jhjacobsen/invertible-resnet/blob/master/models/utils_cifar.py#L194

One might think model.eval() turns off gradients, but it does not; see e.g. [1].
Instead, one needs something like

model.eval()
with torch.no_grad(): 
    # code from before 

This does not usually cause an OOM, but it does if one is training multiple classifiers on the same GPU at the same time.
This is useful when, e.g., repeating experiments to get error bars.

[1] https://discuss.pytorch.org/t/model-eval-vs-with-torch-no-grad/19615
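A minimal sketch of such a test loop with gradients disabled (assuming a standard PyTorch classifier; model, testloader and criterion are placeholders, not the repository's actual objects):

import torch

def test(model, testloader, criterion, device="cuda"):
    model.eval()
    total_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():  # no computation graph is built, so activations are freed immediately
        for inputs, targets in testloader:
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            total_loss += criterion(outputs, targets).item()
            correct += (outputs.argmax(dim=1) == targets).sum().item()
            total += targets.size(0)
    return total_loss / len(testloader), correct / total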

training parameters for SVHN/other datasets

Dear authors,

I am training the model on SVHN with coeff=0.3. After 200 epochs, many singular values are still larger than 1.
Are there any other parameters that should be adjusted?

Best,
Liang

A question about running classification on large images

Hi, thank you so much for providing the code. I want to run it on my own dataset. I set the batch size to 1 and use only 6 layers with 896*448 images, but it still runs out of memory. Do you know why this error happens? Are there ways to solve it? Thank you.

Some questions about INN

Thank you very much for sharing the code. I have a few questions that I would like to ask you:

  1. Glow is structurally reversible, that is, it is a reversible network without any training, whereas your work is not structurally reversible and requires certain training to become a reversible network.

I don't know if my understanding is correct.

  2. Can your work achieve reversibility for an MLP network? If so, can you tell me which part needs to be modified?

Looking forward to your reply!

Visdom shows nothing

Hello,

I start the visdom server with default parameters, and when I run your scripts/classify_cifar.sh script, nothing shows up in visdom except the very last test-accuracy point.

I have no proxies configured; I am just running your code as cloned from the repo.


Spectral norm causes gradient signal to be lost when sigma exceeds coeff

Something I was struggling with in my own implementation of Gouk's spectral norm is that a spectrally normalized layer seems to become stuck once its sigma value reaches coeff.

What I mean by this is:
Take a spectrally normalized FC layer with 2 inputs and 1 output, feed normally distributed random numbers into it, and ask it to maximize the output. This increases the weights until sigma exceeds coeff.

Then take the same layer, feed the normally distributed random numbers into it, and ask it to minimize the output. You'd expect this to decrease the weights until sigma reaches 0, but if its sigma starts above coeff, nothing happens! In fact, the weights hardly receive any gradient signal at all.

I think this might be because this line:
sigma = torch.dot(u, torch.mv(weight_mat, v))

happens with grad enabled, meaning that the gradient is propagated along this pathway, forcing the sigma to stay at 1.

I have made a notebook to demonstrate this problem, and my 'fix'
Gouk-jhjacobsen.zip

I'm not sure if this is the expected behaviour. I'd have thought this is analogous to the dying ReLU problem: as layers' sigmas become saturated, they drop out and stop learning, which might be suboptimal.
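A minimal sketch of the setup described above (the names and objective are my own, not taken from the repository): a 2-in/1-out linear weight is clipped Gouk-style, i.e. rescaled by coeff / sigma whenever sigma exceeds coeff, and sigma is either kept in the graph or detached.

import torch
import torch.nn.functional as F

coeff = 1.0
weight = torch.nn.Parameter(3.0 * torch.randn(1, 2))  # start with sigma likely > coeff
u = F.normalize(torch.randn(1), dim=0)

def clipped_weight(w, detach_sigma=False):
    global u
    with torch.no_grad():  # one step of power iteration for the largest singular value
        v = F.normalize(torch.mv(w.t(), u), dim=0)
        u = F.normalize(torch.mv(w, v), dim=0)
    sigma = torch.dot(u, torch.mv(w, v))  # the line quoted in the issue
    if detach_sigma:
        sigma = sigma.detach()  # one possible 'fix': no gradient through sigma
    factor = torch.max(torch.ones(()), sigma / coeff)
    return w / factor

x = torch.randn(256, 2)
loss = (x @ clipped_weight(weight).t()).mean()  # ask the layer to minimize its output
loss.backward()
# With grad flowing through sigma, the gradient is (essentially) orthogonal to the
# weight, so its norm -- and hence sigma -- barely changes:
print(torch.dot(weight.grad.view(-1), weight.detach().view(-1)))  # ~0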

Question related to the initialization of the model

Dear author of the "i-ResNet",

I would like to ask: what is the purpose of the ignore_logdet parameter in "CIFAR_main.py" at line 268?

with torch.no_grad():
    model(init_batch, ignore_logdet=True)

Also, is there an email address where I can ask my questions related to the "i-ResNet"?

Regards

bits_per_dim

Hi! Thank you very much for your code. I have a question about the function bits_per_dim: why do you add an 8?

def bits_per_dim(logpx, inputs):
    return -logpx / float(np.log(2.) * np.prod(inputs.shape[1:])) + 8.

I sincerely look forward to your reply!
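(Not an answer from the repository, just my assumption for context: if 8-bit pixel values are rescaled to [0, 1] by dividing by 256, the change of variables contributes log2(256) = 8 bits per dimension, which the trailing "+ 8." would account for.) A small numeric sketch under that assumption:

import numpy as np

def bits_per_dim(logpx, shape):
    # -logpx in nats, converted to bits and averaged over dimensions, plus the
    # assumed 8 bits/dim from rescaling 8-bit pixels to [0, 1].
    return -logpx / float(np.log(2.) * np.prod(shape)) + 8.

dims = 3 * 32 * 32                   # CIFAR-sized input
logpx = -dims * np.log(2.) * 3.5     # a hypothetical log-density in nats
print(bits_per_dim(logpx, (3, 32, 32)))  # 3.5 + 8 = 11.5 bits/dim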

Whether the Lipschitz constant of the Actnorm layer should be less than 1

Hello, this is great work! But I have a question:
To ensure the network is reversible, the Lipschitz constant must be less than 1, and you divide the conv and fc layers' weights by their spectral norm. But in the Actnorm layer, the Lipschitz constant is not limited to be less than 1.
Should it be limited in the same way as the conv and fc layers?
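For context (my own sketch, not the repository's answer): actnorm is a per-channel affine map y = s * x + b, so its Lipschitz constant is simply max|s| and is not constrained to be below 1 by construction.

import torch

def actnorm_lipschitz(scale):
    # Lipschitz constant of y = scale * x + bias (elementwise) is max |scale|.
    return scale.abs().max()

s = torch.tensor([0.7, 1.3, 2.1])
print(actnorm_lipschitz(s))  # tensor(2.1000): can exceed 1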
