Decisions
Decisions made measuring disentanglement:
- Achille & Soatto measure
- Higgins on groups
- review: https://arxiv.org/pdf/1811.12359.pdf
- MIG seems applicable only to VAE variants: https://arxiv.org/pdf/1802.04942.pdf
- we use K-means
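A hypothetical sketch of how K-means could serve as a disentanglement proxy: cluster the latent codes, then score cluster purity against a known generative factor. `kmeans`, `purity`, and the toy data are illustrative, not the project's actual code.

```python
import numpy as np

def kmeans(X, k, iters=50):
    # Plain Lloyd's algorithm; centers seeded with evenly spaced samples.
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.linalg.norm(X[:, None] - centers[None], axis=-1).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def purity(labels, factors):
    # Fraction of points whose cluster's majority factor value matches theirs.
    hit = 0
    for j in np.unique(labels):
        _, counts = np.unique(factors[labels == j], return_counts=True)
        hit += counts.max()
    return hit / len(labels)

# Toy latents: two tight clusters, each generated by one factor value.
rng = np.random.default_rng(0)
z = np.concatenate([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
f = np.array([0] * 50 + [1] * 50)
print(purity(kmeans(z, 2), f))  # 1.0 for cleanly separated clusters
```

A purity near 1.0 means each K-means cluster is dominated by a single factor value; on entangled latents the score drops toward chance level.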
Decisions made for DAriA:
[24/02/2019]
- use TensorBoard to check the gradients
- gradients are close to zero
- adding Gaussian noise inside works in tests, but not in the main code; it is probably in conflict with the latent dimension, the embedding dimension, the vocabSize, or a combination of those, since it gives back NaN indices. This must be pointing to something problematic: FIXME
- instead of MSE, try MAE, since the perfect reconstruction of my system makes MSE give back a zero gradient. However, MAE gave zero gradients as well.
- check whether a gradient different from zero is enough to learn a little about the grammar. It seemed so by eye: the gradient histogram became a Gaussian centered on zero, and over 1000 epochs the loss dropped abruptly after 112 epochs. Still, the sentences generated with noise inside are very unsatisfactory.
- noise inside solved the all-zero-gradients problem, but some fine-tuning is necessary, since it works only for lucky choices of vocabDim, GaussianNoise std, and batchSize, when it needs to be structurally solid
- check the decoder when the input is larger than [0, 1], since that might be what is happening when the noise is added: is the clipping working properly? It works properly now: it was clipped to [0, 1] but needed to be [0 + eps, 1 - eps].
- SelfAdaptiveGaussianNoise layer created, but it doesn't solve my problem.
- cross-entropy
- guided by the fact that here they prove that minimizing cross-entropy gives you, for free, minimality, sufficiency, and invariance to nuisances (and maximal disentanglement?), and that SGD will take care of not overfitting. So I assume the practical consequences of CE are more powerful than those of MSE.
- now it's learning! Very good! But it still doesn't completely distill the knowledge in the data, and keeps softmaxes ready to learn other things. I would like those softmaxes to collapse completely into almost-sure distributions. The generative side is not bad, but it has to improve.
- a GaussianNoise layer in the latent space seems to improve learning
- I've approximated a Wasserstein distance loss by keeping a softmax output but using MSE instead of CCE, and it seems to work.
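The all-zero-gradient failure mode noted above (checked via TensorBoard histograms) can also be probed outside TensorBoard with a finite-difference check; a minimal numpy sketch, where `numeric_grad`, `dead_gradient`, and the toy losses are illustrative names, not the project's code:

```python
import numpy as np

def numeric_grad(f, w, eps=1e-6):
    # Central finite differences: a cheap stand-in for inspecting
    # gradient histograms in TensorBoard.
    g = np.zeros_like(w)
    for i in range(w.size):
        d = np.zeros_like(w)
        d.flat[i] = eps
        g.flat[i] = (f(w + d) - f(w - d)) / (2 * eps)
    return g

def dead_gradient(g, tol=1e-8):
    # The failure mode described above: every component close to zero.
    return bool(np.all(np.abs(g) < tol))

w = np.array([0.5, -0.3])
flat = lambda w: 1.0                    # constant loss -> zero gradient everywhere
mse = lambda w: np.sum((w - 1.0) ** 2)  # healthy loss with a nonzero gradient
print(dead_gradient(numeric_grad(flat, w)))  # True
print(dead_gradient(numeric_grad(mse, w)))   # False
```

Running such a probe on a few batches before a long training run catches the "gradients close to zero" symptom early.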
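The last point, keeping a softmax output but scoring it with MSE instead of categorical cross-entropy, coincides with the Brier score; whether it really approximates a Wasserstein distance is the note's own claim, not shown here. A minimal numpy sketch (`softmax` and `brier` are illustrative names):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def brier(logits, onehot):
    # MSE between the softmax probabilities and the one-hot target:
    # the "softmax output + MSE instead of CCE" loss from the note.
    return np.mean((softmax(logits) - onehot) ** 2)

target = np.array([1.0, 0.0, 0.0])
confident = np.array([4.0, 0.0, 0.0])  # nearly one-hot, correct prediction
uniform = np.array([0.0, 0.0, 0.0])    # maximally uncertain prediction
print(brier(confident, target) < brier(uniform, target))  # True
```

Unlike raw MSE on reconstructions, the loss is computed on a proper probability simplex, so confident correct predictions score strictly better than uncertain ones.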
Near future
Far future
- to learn recursion, allow a token that reinitializes the RNN state, so it can restart from the beginning of the tree when that token is chosen
- find a way to let the net decide the optimal noise level for learning the grammar
- compare to ODE nets (torch)
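The reset-token idea above can be sketched with a tiny pure-numpy RNN; `RESET`, `run_rnn`, and all shapes are hypothetical, assuming that emitting the token restores the initial state so generation restarts at the root of the tree:

```python
import numpy as np

RESET = 0  # hypothetical token id that reinitializes the recurrent state

def run_rnn(tokens, E, W, U, h0):
    # Plain tanh RNN step; on RESET the state snaps back to h0.
    h, states = h0.copy(), []
    for t in tokens:
        h = h0.copy() if t == RESET else np.tanh(E[t] @ W + h @ U)
        states.append(h.copy())
    return states

rng = np.random.default_rng(0)
E = rng.normal(size=(5, 4))  # token embeddings
W = rng.normal(size=(4, 3))  # input-to-hidden weights
U = rng.normal(size=(3, 3))  # hidden-to-hidden weights
h0 = np.zeros(3)

s = run_rnn([1, 2, RESET, 1, 2], E, W, U, h0)
# After RESET the trajectory repeats: the second "1 2" run matches the first.
print(np.allclose(s[0], s[3]), np.allclose(s[1], s[4]))  # True True
```

Because the state after the reset token equals `h0`, the model can re-enter the top of the grammar mid-sequence, which is what the recursion note asks for.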