
differentiableariel's Issues

Decisions

Decisions made while measuring disentanglement:

Decisions made on DAriA:

[24/02/2019]

  • TensorBoard to check the gradients
  • gradients close to zero
    • Adding Gaussian noise inside works in tests, but not in the main code; it is probably in conflict with the latent dimension, the embedding dimension, the vocabSize, or a combination of those, since it returns NaN indices. This must be pointing to something problematic: FIXME
    • Instead of MSE, try MAE, since the perfect reconstruction by my system makes MSE give back a zero gradient. However, MAE gave zero gradients as well.
    • Check whether a gradient different from zero is enough to learn a little of the grammar. It seemed so by eye: the gradient histogram became a Gaussian centered on zero, and over 1000 epochs the loss dropped abruptly after epoch 112. Still, the sentences generated with noise inside are very unsatisfactory.
  • Noise inside solved the gradients-always-zero problem, but some fine-tuning is necessary: it only works for lucky choices of vocabDim, GaussianNoise std, and batchSize, when it needs to be structurally solid.
    • Check the decoder when the input is outside [0,1], since that might be what is happening when the noise is added: is the clipping working properly? It now works properly: it was clipped to [0,1], but needed to be clipped to [0+eps, 1-eps].
    • A SelfAdaptiveGaussianNoise layer was created, but it doesn't solve my problem.
  • crossentropy
    • Guided by the fact that here they prove that by minimizing crossentropy you get, for free, minimality, sufficiency, and invariance to nuisances (and maximal disentanglement?), and SGD will take care of not overfitting. So I assume the practical consequences of CE are more powerful than the consequences of MSE.
    • Now it's learning! Very good! But it still doesn't completely distill the knowledge in the data and keeps softmaxes ready to learn other things. I would like those softmaxes to collapse completely into almost-sure distributions. The generative side is not bad, but it has to improve.
    • GaussianNoise in the latent space seems to improve learning.
  • I've approximated a Wasserstein distance loss by keeping the softmax output but using MSE instead of CCE, and it seems to work.
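The gradient checks above can also be done directly in code rather than through TensorBoard histograms. A minimal sketch, assuming TensorFlow 2: the toy Dense model here is a placeholder standing in for the real autoencoder.

```python
import tensorflow as tf

# Toy stand-in model; the real experiments use the DAriA autoencoder.
model = tf.keras.Sequential([tf.keras.layers.Dense(4, activation='softmax')])
x = tf.random.normal((8, 3))
y = tf.one_hot(tf.random.uniform((8,), maxval=4, dtype=tf.int32), 4)

with tf.GradientTape() as tape:
    # The mean loss must be computed inside the tape to be differentiated.
    loss = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(y, model(x)))
grads = tape.gradient(loss, model.trainable_variables)

# A layer whose gradients are all ~0 has stopped learning.
for g in grads:
    print(float(tf.reduce_max(tf.abs(g))))
```

This gives the same numbers the TensorBoard histograms visualize, and makes "gradients close to zero" a check you can assert on.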
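The [0+eps, 1-eps] clipping fix can be sketched in isolation; `safe_clip` and the eps value are illustrative names, not the ones used in the project code.

```python
import numpy as np

def safe_clip(x, eps=1e-7):
    # Clip to [eps, 1 - eps] instead of [0, 1], so a downstream
    # log() never sees exactly 0 or 1 and produces inf/nan gradients.
    return np.clip(x, eps, 1.0 - eps)

noisy = np.array([-0.2, 0.0, 0.5, 1.0, 1.3])  # e.g. after adding noise
clipped = safe_clip(noisy)
print(np.isfinite(np.log(clipped)).all())      # no -inf from log(0)
print(np.isfinite(np.log(1 - clipped)).all())  # no -inf from log(1)
```

Clipping to the closed interval [0, 1] is exactly what lets log(0) through, which is consistent with the NaN indices observed above.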
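A minimal sketch of GaussianNoise in the latent space of a sequence autoencoder, assuming Keras; all dimensions here (vocab_dim, embed_dim, latent_dim, max_len) are hypothetical placeholders for the real configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical dimensions; the real vocabDim / latent size come
# from the experiments above.
vocab_dim, embed_dim, latent_dim, max_len = 32, 16, 8, 10

inp = layers.Input(shape=(max_len,))
x = layers.Embedding(vocab_dim, embed_dim)(inp)
x = layers.LSTM(latent_dim)(x)
# Noise is only active during training; at inference the latent
# code is deterministic.
z = layers.GaussianNoise(stddev=0.1)(x)
x = layers.RepeatVector(max_len)(z)
x = layers.LSTM(latent_dim, return_sequences=True)(x)
out = layers.Dense(vocab_dim, activation='softmax')(x)

model = Model(inp, out)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
```

Placing the noise between encoder and decoder regularizes the latent code without touching the reconstruction path, which matches the observation that latent-space noise improves learning.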
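The softmax-output-with-MSE surrogate can be illustrated numerically. This toy comparison only shows one practical property: unlike CCE, the MSE surrogate stays bounded even when the prediction assigns probability ~0 to the true class.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# One-hot target and a softmax prediction over a toy vocabulary.
target = np.array([0.0, 1.0, 0.0])
pred = softmax(np.array([0.1, 2.0, -1.0]))

mse = np.mean((pred - target) ** 2)   # the "Wasserstein-like" surrogate
cce = -np.sum(target * np.log(pred))  # standard categorical crossentropy

# Even a prediction that misses the true class entirely gives a
# bounded MSE, whereas CCE would blow up towards infinity.
worst = softmax(np.array([10.0, -10.0, 10.0]))
print(np.mean((worst - target) ** 2))  # stays below 1
```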

Close future

Far future

  • to learn recursion, allow a token that reinitializes the RNN state, so that when it is chosen the net can restart from the beginning of the tree
  • find a way to let the net decide the optimal noise for learning the grammar
  • compare to an ODE net (torch)
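The reset-token idea can be sketched with a toy recurrent step; `RESET` and `run_with_reset` are hypothetical names, and a real model would use an actual RNN cell instead of the accumulator used here.

```python
import numpy as np

RESET = 0  # hypothetical token id that re-roots the RNN at the tree start

def run_with_reset(tokens, step, h0):
    """Unroll a recurrent step, reinitializing the state whenever the
    RESET token is chosen, so generation can restart from the
    beginning of the tree."""
    h = h0.copy()
    states = []
    for t in tokens:
        if t == RESET:
            h = h0.copy()  # jump back to the initial state
        else:
            h = step(h, t)
        states.append(h.copy())
    return states

# Toy step: just accumulate token ids in the state.
step = lambda h, t: h + t
states = run_with_reset([1, 2, RESET, 3], step, np.zeros(1))
print(states[2][0])  # 0.0 — the state was reset
print(states[3][0])  # 3.0 — and restarted from scratch
```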
