
differentiableariel's Issues

Decisions

Decisions made while measuring disentanglement:

Decisions made on DAriA:

[24/02/2019]

  • TensorBoard to check the gradients
  • gradients close to zero
    • Adding Gaussian noise inside works in tests, but not in the main code; it is probably in conflict with the latent dimension, the embedding dimension, the vocabSize, or a combination of those, since it returns NaN indices. This must be pointing to something problematic: FIXME
    • Instead of MSE, try MAE, since the perfect reconstruction by my system makes MSE give back a zero gradient. However, MAE gave zero gradients as well.
    • Check whether a gradient different from zero is enough to learn a little of the grammar. It seemed so by eye: the gradient histogram became a Gaussian centered on zero, and over 1000 epochs the loss dropped abruptly after epoch 112. Still, the sentences generated with noise inside are very unsatisfactory.
  • Noise inside solved the gradients-always-zero problem, but some fine-tuning is necessary: it only works for lucky choices of vocabDim, GaussianNoise std, and batchSize, when it needs to be structurally solid.
    • Check the decoder when the input is outside [0,1], since that might be what is happening when the noise is added: is the clipping working properly? It now works properly: it was clipped to [0,1], but needed to be clipped to [0+eps, 1-eps].
    • A SelfAdaptiveGaussianNoise layer was created, but it doesn't solve my problem.
  • crossentropy
    • Guided by the fact that here they prove that by minimizing crossentropy you get, for free, minimality, sufficiency, and invariance to nuisances (and maximal disentanglement?), and SGD will take care of not overfitting. So I assume the practical consequences of CE are more powerful than the consequences of MSE.
    • Now it's learning! Very good! But it still doesn't completely distill the knowledge in the data and keeps softmaxes ready to learn other things. I would like those softmaxes to collapse completely into almost-sure distributions. The generative side is not bad, but it has to improve.
    • GaussianNoise in the latent space seems to improve learning.
  • I've approximated a Wasserstein distance loss by keeping the softmax output but using MSE instead of CCE, and it seems to work.
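The gradient checks above can also be done directly in code rather than through TensorBoard histograms. A minimal sketch, assuming TensorFlow 2: the toy Dense model here is a placeholder standing in for the real autoencoder.

```python
import tensorflow as tf

# Toy stand-in model; the real experiments use the DAriA autoencoder.
model = tf.keras.Sequential([tf.keras.layers.Dense(4, activation='softmax')])
x = tf.random.normal((8, 3))
y = tf.one_hot(tf.random.uniform((8,), maxval=4, dtype=tf.int32), 4)

with tf.GradientTape() as tape:
    # The mean loss must be computed inside the tape to be differentiated.
    loss = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(y, model(x)))
grads = tape.gradient(loss, model.trainable_variables)

# A layer whose gradients are all ~0 has stopped learning.
for g in grads:
    print(float(tf.reduce_max(tf.abs(g))))
```

This gives the same numbers the TensorBoard histograms visualize, and makes "gradients close to zero" a check you can assert on.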
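The [0+eps, 1-eps] clipping fix can be sketched in isolation; `safe_clip` and the eps value are illustrative names, not the ones used in the project code.

```python
import numpy as np

def safe_clip(x, eps=1e-7):
    # Clip to [eps, 1 - eps] instead of [0, 1], so a downstream
    # log() never sees exactly 0 or 1 and produces inf/nan gradients.
    return np.clip(x, eps, 1.0 - eps)

noisy = np.array([-0.2, 0.0, 0.5, 1.0, 1.3])  # e.g. after adding noise
clipped = safe_clip(noisy)
print(np.isfinite(np.log(clipped)).all())      # no -inf from log(0)
print(np.isfinite(np.log(1 - clipped)).all())  # no -inf from log(1)
```

Clipping to the closed interval [0, 1] is exactly what lets log(0) through, which is consistent with the NaN indices observed above.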
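A minimal sketch of GaussianNoise in the latent space of a sequence autoencoder, assuming Keras; all dimensions here (vocab_dim, embed_dim, latent_dim, max_len) are hypothetical placeholders for the real configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical dimensions; the real vocabDim / latent size come
# from the experiments above.
vocab_dim, embed_dim, latent_dim, max_len = 32, 16, 8, 10

inp = layers.Input(shape=(max_len,))
x = layers.Embedding(vocab_dim, embed_dim)(inp)
x = layers.LSTM(latent_dim)(x)
# Noise is only active during training; at inference the latent
# code is deterministic.
z = layers.GaussianNoise(stddev=0.1)(x)
x = layers.RepeatVector(max_len)(z)
x = layers.LSTM(latent_dim, return_sequences=True)(x)
out = layers.Dense(vocab_dim, activation='softmax')(x)

model = Model(inp, out)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
```

Placing the noise between encoder and decoder regularizes the latent code without touching the reconstruction path, which matches the observation that latent-space noise improves learning.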
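The softmax-output-with-MSE surrogate can be illustrated numerically. This toy comparison only shows one practical property: unlike CCE, the MSE surrogate stays bounded even when the prediction assigns probability ~0 to the true class.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# One-hot target and a softmax prediction over a toy vocabulary.
target = np.array([0.0, 1.0, 0.0])
pred = softmax(np.array([0.1, 2.0, -1.0]))

mse = np.mean((pred - target) ** 2)   # the "Wasserstein-like" surrogate
cce = -np.sum(target * np.log(pred))  # standard categorical crossentropy

# Even a prediction that misses the true class entirely gives a
# bounded MSE, whereas CCE would blow up towards infinity.
worst = softmax(np.array([10.0, -10.0, 10.0]))
print(np.mean((worst - target) ** 2))  # stays below 1
```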

Close future

Far future

  • to learn recursion, allow a token that reinitializes the RNN state, so that when it is chosen the net can restart from the beginning of the tree
  • find a way to let the net decide the optimal noise for learning the grammar
  • compare to an ODE net (torch)
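The reset-token idea can be sketched with a toy recurrent step; `RESET` and `run_with_reset` are hypothetical names, and a real model would use an actual RNN cell instead of the accumulator used here.

```python
import numpy as np

RESET = 0  # hypothetical token id that re-roots the RNN at the tree start

def run_with_reset(tokens, step, h0):
    """Unroll a recurrent step, reinitializing the state whenever the
    RESET token is chosen, so generation can restart from the
    beginning of the tree."""
    h = h0.copy()
    states = []
    for t in tokens:
        if t == RESET:
            h = h0.copy()  # jump back to the initial state
        else:
            h = step(h, t)
        states.append(h.copy())
    return states

# Toy step: just accumulate token ids in the state.
step = lambda h, t: h + t
states = run_with_reset([1, 2, RESET, 3], step, np.zeros(1))
print(states[2][0])  # 0.0 — the state was reset
print(states[3][0])  # 3.0 — and restarted from scratch
```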
