
Memory Efficient MAML

Overview

PyTorch implementation of Model-Agnostic Meta-Learning (MAML) [1] with gradient checkpointing [2]. It lets you perform roughly 10-100x more MAML steps within the same GPU memory budget.
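For context, plain MAML differentiates through the entire inner-loop update, which is what makes it memory-hungry. A minimal single-step sketch in plain PyTorch (a toy linear model and random data, not the torch_maml API) illustrates the second-order gradient flow that gradient checkpointing makes affordable over many more steps:

```python
import torch
import torch.nn as nn

# Toy model and task data, for illustration only.
model = nn.Linear(4, 1)
x, y = torch.randn(8, 4), torch.randn(8, 1)
loss_fn = nn.MSELoss()
inner_lr = 0.1

# One inner-loop adaptation step: create_graph=True keeps the graph
# of the update so the meta-gradient can flow back through it.
params = list(model.parameters())
inner_loss = loss_fn(model(x), y)
grads = torch.autograd.grad(inner_loss, params, create_graph=True)
adapted = [p - inner_lr * g for p, g in zip(params, grads)]

# Evaluate with the adapted parameters (functional forward pass).
adapted_pred = x @ adapted[0].t() + adapted[1]
outer_loss = loss_fn(adapted_pred, y)
outer_loss.backward()  # meta-gradients land in model.parameters()'s .grad
```

Every inner step stores activations for the backward pass; checkpointing trades some recomputation for not keeping all of them in memory at once.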

Install

For a normal installation, run `pip install torch_maml`

For a development installation, clone the repo and run `python setup.py develop`

How to use:

See examples in example.ipynb


Tips and tricks

  1. Make sure your model has no implicit parameter updates, such as torch.nn.BatchNorm2d with track_running_stats=True. With gradient checkpointing, these updates are performed twice (once per forward pass). If you still want them, take a look at torch_maml.utils.disable_batchnorm_stats. Note that vanilla BatchNorm{1-3}d is already supported out of the box.

  2. When computing gradients through many MAML steps (e.g. 100 or 1000), watch out for vanishing and exploding gradients inside the inner-loop optimizer, just as in RNNs. This implementation supports gradient clipping to mitigate the exploding part of the problem.

  3. Also, with a large number of MAML steps, be aware of accumulating numerical error due to floating-point precision, and in particular to non-deterministic CUDNN operations. We recommend setting torch.backends.cudnn.deterministic = True. The problem appears when gradients become slightly noisy due to these errors; during backpropagation through many MAML steps, the noise is likely to grow dramatically.

  4. You could also consider Implicit Gradient MAML [3] as a memory-efficient alternative. While that algorithm requires even less memory, it assumes that the inner-loop optimization converges to an optimum, so it is inapplicable if your task does not always converge by the time you start backpropagating. In contrast, this implementation lets you meta-learn even from a partially converged state.
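Tip 1 can be applied in plain PyTorch by switching off running-stat tracking on every batch-norm module before adaptation (a hand-rolled sketch; torch_maml.utils.disable_batchnorm_stats provides a ready-made helper):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())

# Stop running-stat updates: with checkpointing each forward runs twice,
# so the exponential moving averages would otherwise be updated twice.
for m in model.modules():
    if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
        m.track_running_stats = False
```

With track_running_stats=False, training-mode forward passes normalize with batch statistics and leave the running buffers untouched.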
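For tip 2, torch_maml applies clipping inside its differentiable inner loop; the standard (non-differentiable) PyTorch primitive it corresponds to looks like this, shown on a toy example with deliberately large gradients:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)
x = torch.randn(16, 4)
y = 100.0 * torch.randn(16, 1)  # large targets -> large gradients

loss = nn.functional.mse_loss(model(x), y)
loss.backward()

# Rescale gradients so their total L2 norm does not exceed max_norm.
max_norm = 1.0
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
total_after = torch.sqrt(sum(p.grad.norm() ** 2 for p in model.parameters()))
```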
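Tip 3 amounts to a few global switches set once at startup (the seed call is just good practice for reproducibility, not strictly required by the tip):

```python
import torch

# Make cuDNN kernels deterministic to reduce run-to-run numerical
# noise that compounds when backpropagating through many MAML steps.
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False  # benchmark mode picks kernels non-deterministically
torch.manual_seed(0)
```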

References

[1] Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

[2] Gradient checkpointing technique (GitHub)

[3] Meta-Learning with Implicit Gradients

