
to_share_or_not_to_share's People

Contributors

apourchot


to_share_or_not_to_share's Issues

Forgot to write back the batchnorm statistics when fine-tuning the BN layers

I really like the idea of fine-tuning the BN layers before evaluating the model, and I see that the function `ft_bn_stats` tries to do this for the sampled model. However, it looks like the accumulated `bn_means` and `bn_vars` are never written back to the model. Because the momentum is set to 1.0, each forward pass overwrites the running statistics with those of the current batch, so the model ends up with the mean and variance of the final batch only, rather than the average over the four batches.
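To make the momentum argument concrete, here is a small plain-Python sketch of the running-statistic update as PyTorch defines it, `running = (1 - momentum) * running + momentum * batch_stat`. The helper name `update_running` and the per-batch means are made up for illustration and are not taken from the repository.

```python
def update_running(running, batch_stat, momentum):
    """One BatchNorm running-statistic update (PyTorch convention)."""
    return (1.0 - momentum) * running + momentum * batch_stat


# Hypothetical per-batch means for four fine-tuning batches.
batch_means = [0.0, 1.0, 2.0, 5.0]

running = 0.0
for mean in batch_means:
    # With momentum=1.0 the previous value is discarded on every step.
    running = update_running(running, mean, momentum=1.0)

last_batch = running                            # what the model ends up storing
average = sum(batch_means) / len(batch_means)   # what was presumably intended

print(last_batch, average)  # prints: 5.0 2.0
```

With momentum 1.0 the stored value equals the last batch's statistic, so any averaging has to happen outside the BN layer and then be copied back explicitly.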

`
def ft_bn_stats(model, task, train_iter, device, config):
    model.train()  # in order to update bn stats
    model.set_bn_momentum(1.0)  # so no data leaks from one batch to the other

    # compute bn stats with given task
    bn_means = {n: torch.zeros_like(m.running_mean) for (n, m) in model.named_modules() if "BatchNorm" in str(type(m))}
    bn_vars = {n: torch.zeros_like(m.running_var) for (n, m) in model.named_modules() if "BatchNorm" in str(type(m))}

    # we don't need gradients
    with torch.no_grad():
        for n, (x_t, y_t) in enumerate(train_iter):

            if n < config["EVAL"]["n_ft_bn_stats"]:
                x_t, y_t = x_t.to(device), y_t.to(device)
                model.forward(x_t, task)

                for name, module in model.named_modules():
                    if "BatchNorm" in str(type(module)):
                        bn_means[name] += module.running_mean / config["EVAL"]["n_ft_bn_stats"]
                        bn_vars[name] += module.running_var / config["EVAL"]["n_ft_bn_stats"]

            else:
                break

`
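One possible fix, as a minimal sketch rather than the repository's actual code: after the loop, copy the averaged statistics into each BN layer's `running_mean` and `running_var` buffers. The function name `average_bn_stats`, the plain `model(x)` call (no `task` argument), and setting `module.momentum` directly instead of calling `set_bn_momentum` are all assumptions made to keep the example self-contained.

```python
import torch
import torch.nn as nn

BN_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)


def average_bn_stats(model, batches, n_batches):
    """Average per-batch BN statistics over up to n_batches and store them in the model."""
    bn_layers = {n: m for n, m in model.named_modules() if isinstance(m, BN_TYPES)}
    old_momentum = {n: m.momentum for n, m in bn_layers.items()}
    bn_means = {n: torch.zeros_like(m.running_mean) for n, m in bn_layers.items()}
    bn_vars = {n: torch.zeros_like(m.running_var) for n, m in bn_layers.items()}

    model.train()  # BN layers only update running stats in train mode
    for m in bn_layers.values():
        m.momentum = 1.0  # running stats become the stats of the current batch

    seen = 0
    with torch.no_grad():
        for x in batches:
            if seen >= n_batches:
                break
            model(x)
            for n, m in bn_layers.items():
                bn_means[n] += m.running_mean
                bn_vars[n] += m.running_var
            seen += 1

        # The step the issue reports as missing: write the averages back.
        if seen > 0:
            for n, m in bn_layers.items():
                m.running_mean.copy_(bn_means[n] / seen)
                m.running_var.copy_(bn_vars[n] / seen)

    # Restore the original momentum values.
    for n, m in bn_layers.items():
        m.momentum = old_momentum[n]
    return model


# Usage on a toy model (hypothetical, for illustration only).
torch.manual_seed(0)
toy = nn.Sequential(nn.Linear(3, 4), nn.BatchNorm1d(4))
toy_batches = [torch.randn(8, 3) for _ in range(4)]
average_bn_stats(toy, toy_batches, n_batches=4)
```

Dividing by the number of batches actually seen, rather than by the configured count, keeps the average correct if the iterator runs out early. Note that averaging per-batch variances is an approximation of the variance over the pooled data, which is the same approximation the original accumulation makes.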
