
morph.py's Introduction

A crafty implementation of Google's MorphNet (and derivative iterations) in PyTorch.

This API is undergoing wild changes as it approaches release.

  • Almost every change will be a breaking change.
  • Do not rely on the functions as they currently are

Please feel free to look around, but bookmark the release tag you're working from. Master will be changing, viciously.

It is recommended that you consult the current working branch for a more realistic view of the codebase.


Update: State of the Project, May 2019

With the recent hype around MorphNet (thanks to Google's blog post) I've received emails and new GitHub issues about this project's usability. Allow me to address those here.

  1. This project was started in late January 2019 to serve as a toolkit for MorphNet's functionality
    • It was part of my research fellowship, but it unfortunately faded into the background.
    • As such, most of the automatic network rearchitecting isn't currently in this codebase.
    • For the next two months (May and June 2019) I will be intermittently folding my private usage of these tools into the publicly available code here, so that it lives up to the hype of the paper.
  2. This project is intended to be a clear and legible implementation of the algorithm
    • After seeing the original codebase, I was dismayed at how difficult the code was to follow
      • Admittedly, I'm not the world's expert on Tensorflow/PyTorch code.
      • But I know clever code isn't instructive
    • As such, I wanted a simpler approach that someone could step through and, at any point, trace back to the paper's ideas
    • This would make the simplicity of the algorithm plainly apparent to the user
  3. Open-Source != "Will work for free"
    • Contributions are preferable to complaints. I can more easily take 20 minutes to read your PR than I can write a mission statement (like this one) or fend off gripes-by-email.
      • Instead of emailing me, create a descriptive issue of what you want - with an idealized code example - or of what is broken.
    • We have the seeds of a conversation started on how to make Morph.py ready for primetime, but until then please be patient!
    • Hindsight is indeed 20-20: had I known certain opportunities wouldn't pan out, or that Google would drop a bomb in my lap, I wouldn't have been caught off guard by the rush of new users.

If you want to submit a PR for a decent issue/complaint/feature request template, that would be much appreciated.

Thank you for your patience and I hope you enjoy what I have here for you at present.


Understanding MorphNet

A Stephen Fox endeavor to become an Applied AI Scientist.

Background Resources

Key Ideas

  1. Make it simple to refine neural architectures
  2. Focus on dropping model parameter size while keeping performance as high as possible
  3. Make the tools user-friendly, and clearly documented

Project Roadmap


Usage

Installation

pip install morph-py

Code Example: Using the tools provided

The following code example shows the toolkit use case.

  • This is not the auto-magical, "make my network better for free" path
  • This is how you could manually pick and choose when morphing happens to your network
    • Maybe you know that there's a particularly costly convolution
    • Or you know that your RNN can hit the exploding gradient problem around epoch 35
    • Or any other use case (FYI: the above can [maybe should] be solved by other means)
# assumes your usual training setup: model, optimizer, loss_fn, dataloader,
# epoch_count, and init_optimizer are defined elsewhere
import logging

import morph

morph_optimizer = None  # created only once morphing begins
# train loop
for e in range(epoch_count):

  for input, target in dataloader:
    optimizer.zero_grad() # optional: zero gradients or don't...
    output = model(input)

    loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()

    # compare the morphed model against the original, once it exists
    if morph_optimizer:
      morph_optimizer.zero_grad()
      morph_loss = loss_fn(morph_model(input), target)

      logging.info(f'Morph loss - Standard loss = {morph_loss - loss}')

      morph_loss.backward()
      morph_optimizer.step()

  # Experimentally supported: initialize our morphing halfway through training.
  # Placed outside the batch loop so the morph only happens once.
  if e == epoch_count // 2:
    # if you want to override your model in place:
    #   model = morph.once(model)

    # if you want to compare in parallel:
    morph_model = morph.once(model)

    # either way, you need to tell a fresh optimizer about the morphed parameters
    morph_optimizer = init_optimizer(params=morph_model.parameters())

Code Example: Automatic morphing of your architecture

TODO

Notes:

  • This is more like what Google promised regarding improved performance.
  • This project focuses on model size regularization and avoids FLOPs regularization.
# TODO: Code example of the dynamic, automatic morphing implementation

Setup (to work alongside me)

git clone https://github.com/stephenjfox/Morph.py.git

Requisites

  • Install Anaconda (or Miniconda). They've made it easier over the years; if you haven't already, please give it a try

Install Pip

  1. conda install pip
  2. Proceed as normal

Dependencies

  • Jupyter Notebook
    • And a few tools to make it better on your local environment like nb_conda, nbconvert, and nb_conda_kernels
  • Python 3.6+ because Python 2 is dying
  • PyTorch (conda install pytorch torchvision -c pytorch)


morph.py's Issues

Widen: Uniform widening

This has already been implemented, in the form of resize_layers. This ticket includes the following:

  • Rename the function and improve its documentation
  • Unit test the function
  • Integration test "..."
  • Order with the "layer fitting" segment

Should those all be completed, especially after issues #4 and #8, we will be ready to ship. (A sketch of the uniform-widening idea follows below.)
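
For context, the uniform widening step in MorphNet is just a width multiplier: every layer's output width is scaled by a common factor ω until the resource budget is spent. A minimal sketch of that idea (illustrative only, not the project's resize_layers function; widen_uniformly is a hypothetical name):

def widen_uniformly(widths, omega):
  """Scale every layer width by a common factor `omega` (MorphNet's width multiplier)."""
  return [max(1, round(w * omega)) for w in widths]

# e.g. the hidden widths of a small MLP, widened by 50%
print(widen_uniformly([64, 128, 64], omega=1.5))  # [96, 192, 96]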

Utils are too essential?

Could these be better named, or segregated?
It may not be evident yet (as there's not a ton of code) but if every piece of the project is using these tools, there may be:

  1. A better name for them
  2. A better namespace for them
  3. A better way to expose them to the user, as part of the unifying mission of providing a toolkit, not just a "catch all" algorithm

Widen: Layer fitting

If the layers don't fit together snugly - that is, their dimensions don't align - we reshape them so they fit together better (see the sketch below).

  • This is done by resizing the next layer's input dimension to match the preceding layer's output dimension
    • Any other implementation would be eschewing the weights and just confusing the mathematics.
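
As a rough illustration only (not the project's actual resize_layers implementation; fit_to_previous is a hypothetical helper), here is a minimal sketch of fitting one nn.Linear to the layer that precedes it, preserving the overlapping weights instead of discarding them:

import torch
import torch.nn as nn

def fit_to_previous(prev: nn.Linear, nxt: nn.Linear) -> nn.Linear:
  """Resize `nxt` so its input dimension matches `prev`'s output dimension,
  copying the slice of old weights that still fits."""
  if prev.out_features == nxt.in_features:
    return nxt  # already aligned

  resized = nn.Linear(prev.out_features, nxt.out_features, bias=nxt.bias is not None)
  overlap = min(prev.out_features, nxt.in_features)
  with torch.no_grad():
    resized.weight[:, :overlap] = nxt.weight[:, :overlap]
    if nxt.bias is not None:
      resized.bias.copy_(nxt.bias)
  return resized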

Shrink Pt. 2: Prune/Clip

Once we've detected all of the sparse neurons in the network, we're going to clip them out.

Make that happen in a sensible fashion
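
For intuition only (the eventual prune implementation may well differ; clip_sparse_neurons is a hypothetical helper), a minimal sketch of clipping out the output neurons of an nn.Linear whose weight rows are entirely zero, and trimming the matching input columns of the layer that follows it:

import torch
import torch.nn as nn

def clip_sparse_neurons(layer: nn.Linear, following: nn.Linear):
  """Drop output neurons of `layer` with all-zero weights and remove the
  corresponding input columns from `following`."""
  keep = layer.weight.abs().sum(dim=1) != 0  # mask of neurons worth keeping
  kept = int(keep.sum())

  new_layer = nn.Linear(layer.in_features, kept)
  new_following = nn.Linear(kept, following.out_features)
  with torch.no_grad():
    new_layer.weight.copy_(layer.weight[keep])
    new_layer.bias.copy_(layer.bias[keep])
    new_following.weight.copy_(following.weight[:, keep])
    new_following.bias.copy_(following.bias)
  return new_layer, new_following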

Doesn't support multi-headed output architectures

Lying somewhere between a bug and a feature enhancement, the current iteration of the nn.Module.children traversal assumes that the layers are in a sequence (like a significantly deep convolutional network).

It does not, however, consider that the children it encounters may all be siblings, rather than always being in parent-child relationships (a concrete example follows below).

  • Given how nn.Module represents its layer relationships, crafting a solution for many cases for this could be very tricky.

Some architectures that may be negatively impacted by this:

  • U-Net
  • DenseNets (mentioned in the presentation)
  • Any network that trains on multiple outputs
    • Depending on how it's modeled, these could all be considered sequential children or a nested output.
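
To make the problem concrete (a hypothetical model, not one from this codebase): for the module below, children() yields the trunk and both heads as a flat sequence, but head_a and head_b are siblings that each consume the trunk's output, so treating them as if they were chained one after the other would resize them incorrectly.

import torch.nn as nn

class TwoHeaded(nn.Module):
  def __init__(self):
    super().__init__()
    self.trunk = nn.Linear(32, 16)
    self.head_a = nn.Linear(16, 4)  # sibling of head_b, not its parent
    self.head_b = nn.Linear(16, 2)

  def forward(self, x):
    shared = self.trunk(x)
    return self.head_a(shared), self.head_b(shared)

print([type(m).__name__ for m in TwoHeaded().children()])
# ['Linear', 'Linear', 'Linear'] -- the sibling relationship is invisible here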

how can I use this?

It's awesome you implemented this in pytorch.
I am really not sure how I can use this on my net.
I see there's a demo.py but unless I'm missing something obvious it doesn't really show how to use this package. Why are you running morph once if it doesn't do anything?
Thanks,
Dan

Shrink Pt. 1: Sparsify

  • Prove that sparsify(tensor, threshold) performs as intended (see the sketch below)
  • Prove shrink on a given layer produces the correct count of neurons for a given layer
    • ... for all layers of a given nn.Module

I recommend an internal encapsulation to handle this relaying of information.

  • Simple: a named tuple for carrying the data
  • Complex: a small control class that makes focused decisions
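
For reference while writing those tests, one plausible reading of sparsify(tensor, threshold) - an assumption about the intended behavior, not a quote of the implementation - is that it zeroes out entries whose magnitude falls below the threshold:

import torch

def sparsify(tensor: torch.Tensor, threshold: float) -> torch.Tensor:
  """Zero out entries whose absolute value is below `threshold`."""
  return torch.where(tensor.abs() < threshold, torch.zeros_like(tensor), tensor)

# a quick check of the intended behavior
t = torch.tensor([0.01, -0.5, 0.2, -0.001])
out = sparsify(t, threshold=0.1)
assert ((out == 0) | (out.abs() >= 0.1)).all()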
