Comments (8)
Good to know that it is not a fundamental issue. The solution you suggested seems great; I would be very thankful if you could make those updates.
My users are mainly scientists, so I don't need initialization to be super fast, just a bit faster would be great. I am working on automating the analysis of electron microscopy images if you are curious, mostly within materials science.
Best,
Jacob
from e2cnn.
He weight initialisation now supports caching of the variances tensor; have a look here!
Hope this helps!
Gabriele
Hi @jacobjma
Thank you for a very well-crafted and important package. I have been using it to test a rotation-equivariant UNet-type network. I already see VERY good results: the standard UNet has 92% accuracy on our dataset, whereas even the most naive rotation-equivariant implementation achieves 99%(!)
Thanks for sharing this, I am very happy to hear it!
Good question! Thanks for opening this issue; this is indeed an annoying problem right now.
I have experienced the same, and with very wide networks it gets even more annoying.
The problem comes from the weight initialization.
All conv layers automatically perform a form of He's weight initialization when they are instantiated.
Unfortunately, the code which does this initialization is quite inefficient (you can see it here).
This was not a problem for my experiments as this code is run only at initialization and does not affect the run time.
For the moment, I will update the code to make the automatic initialization optional with a flag. This already saves some time when you don't need to perform initialization (e.g. if you want to load weights you stored before).
If I understand correctly, this should also be what you are interested in, right?
Best,
Gabriele
Here it is! :)
Now, weight initialization is still done by default (I thought it was better for backwards compatibility), but you can set the parameter `initialize=False` when you instantiate `R2Conv` to skip it.
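The pattern behind that flag can be sketched with a toy class; everything below (`SketchConv`, `_he_init`, the shapes) is illustrative and not the actual e2cnn code, but it shows why skipping the default initialization saves time when you are going to load stored weights anyway:

```python
import random

class SketchConv:
    """Toy layer illustrating an `initialize` flag
    (a sketch of the pattern, not the real e2cnn R2Conv)."""

    def __init__(self, fan_in, fan_out, initialize=True):
        # Allocate the weights without touching them.
        self.weight = [0.0] * (fan_in * fan_out)
        if initialize:
            self._he_init(fan_in)

    def _he_init(self, fan_in):
        # He initialization: zero-mean Gaussian with variance 2 / fan_in.
        std = (2.0 / fan_in) ** 0.5
        self.weight = [random.gauss(0.0, std) for _ in self.weight]

# Skip the expensive step when you will load stored weights anyway:
fast = SketchConv(64, 128, initialize=False)
slow = SketchConv(64, 128)  # initialized by default
```

With the real `R2Conv`, the idea is the same: pass `initialize=False` at construction and then load your checkpoint as usual.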
I am working on automating the analysis of electron microscopy images if you are curious, mostly within materials science.
That is interesting indeed! I would be very curious to know how you use our library and if it works well for this task
Thanks again for your feedback and, please, let us know if you encounter any issues or you have any suggestions!
Best,
Gabriele
Jacob, I'm envious of your 16 seconds :) I also implemented a UNet, and it takes about 3 minutes to initialize. I profiled it, if it helps (attached).
slow_model_initialization.prof.zip
There are Python functions getting called tens of millions of times; you're probably doing something in Python for every pixel.
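For anyone wanting to reproduce a profile like this, a generic cProfile recipe is enough; `build_model` below is just a stand-in for the real (slow) model construction:

```python
import cProfile
import pstats

def build_model():
    # Stand-in for the slow model construction being profiled;
    # replace with your actual UNet instantiation.
    return [str(i) for i in range(100000)]

profiler = cProfile.Profile()
profiler.enable()
model = build_model()
profiler.disable()

# Show the ten functions with the largest cumulative time.
stats = pstats.Stats(profiler)
stats.sort_stats("cumulative").print_stats(10)
```

Sorting by cumulative time surfaces the per-pixel Python calls mentioned above, since their millions of invocations dominate the totals.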
I'll try the workaround to switch off initialization for inference.
Thanks! Andrew
I passed `initialize=False` to all of my convolution layers, and now initialization time is down to 13 seconds.
I looked at the profile, and `str.format()` and `list.append()` are still getting called 12 million times during convolution init, somewhere in `_compute_attrs_and_ids`.
Profile attached:
slow_model_initialization.prof.zip
Hi!
I am building a Wide ResNet-50 with this package, and initialization can take 15 minutes for me! I wonder if there is a faster way to initialize? Thank you a lot!
Hi @EBGU
I am sorry for the late reply, I missed this comment completely :(
So far, there is not much one can do.
I will add some caching in the next release to speed up the He weight init.
If you check its implementation here, the delay comes from the computation of `vars`.
This computation only depends on `basisexpansion`. I can add a caching flag that stores the tensor `vars` on disk the first time it is computed and reuses it if the same `basisexpansion` (i.e. the same conv layer) is passed. This will not make initialisation faster the first time, but if you plan to rerun the same architecture multiple times (e.g. for hyperparameter search), this would save you quite some time.
Best,
Gabriele