Comments (8)

jacobjma commented on June 3, 2024

Good to know that it is not a fundamental issue. The solution you suggested seems great; I would be very thankful if you made those updates.

My users are mainly scientists, so I don't need initialization to be super fast, just a bit faster would be great. I am working on automating the analysis of electron microscopy images if you are curious, mostly within materials science.

Best,
Jacob

from e2cnn.

Gabri95 commented on June 3, 2024

He weight initialisation now supports caching of the variances tensor, have a look here!

Hope this helps!

Gabriele

Gabri95 commented on June 3, 2024

Hi @jacobjma

> Thank you for a very well-crafted and important package. I have been using this package to test a rotation-equivariant UNet-type network. I already see VERY good results; the standard UNet has 92% accuracy on our dataset, whereas the most naive rotation-equivariant implementation achieves 99% (!)

Thanks for sharing this, I am very happy to hear it!

Good question! Thanks for opening this issue, this is indeed an annoying problem now.
I have experienced the same and when you use very wide networks it gets even more annoying.
The problem comes from the weight initialization.
All conv layers automatically perform a form of He's weight initialization while instantiating them.
Unfortunately, the code which does this initialization is quite inefficient (you can see it here).
This was not a problem for my experiments as this code is run only at initialization and does not affect the run time.
I will update the code for the moment to make the automatic initialization optional with a flag. This already saves some time when you don't need to perform initialization (e.g. if you want to load the weights you have stored before).
If I understand correctly, this should also be what you are interested in, right?

Best,
Gabriele

Gabri95 commented on June 3, 2024

Here it is! :)

Now, weight initialization is still done by default (I thought it was better for backwards compatibility) but you can set the parameter initialize = False when you instantiate R2Conv to skip it.

> I am working on automating the analysis of electron microscopy images if you are curious, mostly within materials science.

That is interesting indeed! I would be very curious to know how you use our library and whether it works well for this task.

Thanks again for your feedback and, please, let us know if you encounter any issues or have any suggestions!

Best,
Gabriele

drewm1980 commented on June 3, 2024

Jacob, I'm envious of your 16 seconds :) I also implemented a UNet, and it's taking about 3 minutes to initialize. I profiled it, in case it helps (attached).

slow_model_initialization.prof.zip

There are Python functions getting called tens of millions of times; you're probably doing something in Python for every pixel.

I'll try the workaround to switch off initialization for inference.

Thanks! Andrew

drewm1980 commented on June 3, 2024

I passed initialize=False to all of my convolution layers, and now initialization time is down to 13 seconds.
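To check how much a change like passing initialize=False actually saves, construction time can be measured directly. The sketch below is only an illustration: `timed` and `build_toy_model` are hypothetical names, and the toy constructor stands in for a real network. With e2cnn, one would instead pass a lambda that builds the actual model, once with and once without initialize=False.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(factory: Callable[[], T]) -> Tuple[T, float]:
    """Call factory() once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = factory()
    return result, time.perf_counter() - start


def build_toy_model(initialize: bool = True) -> dict:
    """Hypothetical stand-in for a model constructor with optional initialization."""
    weights = [0.0] * 1000
    if initialize:
        # Deliberately slow pure-Python loop, mimicking a costly weight init.
        weights = [float(sum(i * j for j in range(200))) for i in range(1000)]
    return {"weights": weights}


_, t_init = timed(lambda: build_toy_model(initialize=True))
_, t_skip = timed(lambda: build_toy_model(initialize=False))
print(f"with init: {t_init:.4f}s, without init: {t_skip:.4f}s")
```

Using `time.perf_counter` rather than `time.time` gives a monotonic, high-resolution clock, which matters when the difference being measured is small.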

I looked at the profile, and str.format() and list.append() are still getting called 12 million times in convolution init. It seems to come from _compute_attrs_and_ids.

Profile attached:

slow_model_initialization.prof.zip
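For anyone who wants to reproduce this kind of analysis, here is a minimal sketch using the standard library's cProfile and pstats. The function `slow_init` is a hypothetical toy that makes many small str.format() and list.append() calls; it is not the library's code. Sorting by call count surfaces functions that are invoked very many times.

```python
import cProfile
import io
import pstats


def slow_init(n: int = 2000) -> list:
    """Toy stand-in for model construction with many small Python calls."""
    out = []
    for i in range(n):
        # One str.format() and one list.append() call per item.
        out.append("{}_{}".format(i, i + 1))
    return out


# Record a profile of the call.
profiler = cProfile.Profile()
profiler.enable()
slow_init()
profiler.disable()

# Sort by number of calls and keep the top entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("ncalls").print_stats(5)
report = buffer.getvalue()
print(report)
```

A saved profile file can be loaded the same way by passing its path to `pstats.Stats(...)` instead of the profiler object.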

EBGU commented on June 3, 2024

Hi!
I am building a Wide ResNet-50 with this package, and initialization can take 15 minutes for me! Are there any faster initialization methods? Thanks a lot!

Gabri95 commented on June 3, 2024

Hi @EBGU

I am sorry for the late reply, I missed this comment completely :(

So far, there is not much one can do.
I will add some caching in the next release to speed up the He weight init.
If you check its implementation here, the delay comes from the computation of vars.
This computation depends only on basisexpansion. I can add a caching flag that stores the tensor vars on disk the first time it is computed and reuses it if the same basisexpansion (i.e. the same conv layer) is passed. This will not make initialisation faster the first time, but if you plan to rerun the same architecture multiple times (e.g. for hyperparameter search), it would save you quite some time.
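The general idea of such a disk cache can be sketched as follows. This is not the library's implementation, only an illustration under assumptions: `cached_on_disk`, the string key, and the pickle-based storage are all hypothetical choices, and the placeholder `compute_vars` stands in for the slow variance computation.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable


def cached_on_disk(key: str, compute: Callable[[], Any], cache_dir: Path) -> Any:
    """Return compute() for this key, reusing a pickled result if one exists."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the key so that any layer description maps to a safe file name.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    path = cache_dir / f"{digest}.pkl"
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    value = compute()
    with path.open("wb") as f:
        pickle.dump(value, f)
    return value


# Example: the second call reads the stored value instead of recomputing it.
demo_dir = Path(tempfile.mkdtemp())


def compute_vars() -> list:
    """Placeholder for the slow computation (hypothetical)."""
    return [1.0 / (i + 1) for i in range(5)]


first = cached_on_disk("conv3x3-in16-out32", compute_vars, demo_dir)
second = cached_on_disk("conv3x3-in16-out32", compute_vars, demo_dir)
print(first == second)
```

As noted above, a cache like this only helps from the second run onward, and the key must capture everything the computation depends on, otherwise stale values would be reused.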

Best,
Gabriele
