
vaes's Introduction

To train:

python train.py [-h] (--basic | --nf | --iaf | --hf | --liaf) [--flow FLOW]

# E.g.
python train.py --basic
python train.py --nf --flow 10

Notes

  • All the models (IAF, NF, VAE, etc.) are in models.py.

  • All the neural net functions are in neural_networks.py.

  • All the loss functions are in loss.py.

  • The train function accepts an encoder and a decoder. This is where you specify the type of encoder/decoder.

  • An encoder takes in (x, e) and spits out z.

  • A decoder takes in z and spits out x.

  • To implement a new encoder F with hyperparameters W, we define a hidden function _F_encoder(x, e, W) and then set F_encoder(W) = lambda x, e: _F_encoder(x, e, W). Usually W comprises a neural network, flow lengths and so on. The method for decoders is identical. This allows us to define completely generic encoders and decoders with arbitrary structures and hyperparameters (see the sketch below).
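
A minimal sketch of that factory pattern, assuming a simple Gaussian MLP encoder; the layer sizes, dict layout of W, and the use of numpy (rather than the repo's TensorFlow graph) are illustrative assumptions, not the actual code:

    # Sketch of the _F_encoder / F_encoder(W) pattern described above.
    import numpy as np

    def _mlp_encoder(x, e, W):
        # W bundles the encoder's hyperparameters (here, weight matrices).
        h = np.tanh(x @ W["W_h"])            # hidden layer
        mu = h @ W["W_mu"]                   # Gaussian mean
        log_sigma = h @ W["W_sigma"]         # Gaussian log std
        return mu + np.exp(log_sigma) * e    # reparameterised sample z

    def mlp_encoder(W):
        # Bind the hyperparameters so train() only ever sees f(x, e) -> z.
        return lambda x, e: _mlp_encoder(x, e, W)

    # Usage:
    rng = np.random.default_rng(0)
    W = {"W_h": rng.normal(scale=0.01, size=(784, 256)),
         "W_mu": rng.normal(scale=0.01, size=(256, 32)),
         "W_sigma": rng.normal(scale=0.01, size=(256, 32))}
    encoder = mlp_encoder(W)
    z = encoder(rng.normal(size=(8, 784)), rng.normal(size=(8, 32)))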

vaes's People

Contributors

isaachenrion, wellecks


vaes's Issues

normalizing flow loss function issue

Hi,

In the paper https://arxiv.org/pdf/1505.05770.pdf, equation 15, the loss function is F(x), and we need to minimize this F(x).

After some transformation of equation 15, the first term is the KL divergence (positive), the second term is reconstruct_loss (negative), and the third term is log_jacobian_det (negative). Is my understanding correct?

So I don't understand your loss function in the code: why is your KL negative and reconstruct_loss positive?

Regards,
Wei
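
For reference, a sketch of that free energy (equation 15 of the paper, written in the paper's notation for planar flows; worth double-checking against the paper itself):

    \mathcal{F}(x) = \mathbb{E}_{q_0(z_0)}\!\left[\ln q_0(z_0)\right]
                   - \mathbb{E}_{q_0(z_0)}\!\left[\log p(x, z_K)\right]
                   - \mathbb{E}_{q_0(z_0)}\!\left[\sum_{k=1}^{K} \ln\left|1 + u_k^{\top}\psi_k(z_{k-1})\right|\right]

Since \log p(x, z_K) = \log p(x \mid z_K) + \log p(z_K), this splits into a reconstruction term, a KL-style term against the prior, and the log-det-Jacobian correction. Multiplying through by -1 turns the free energy to be minimized into the ELBO to be maximized, which flips the sign of every term; which convention the code uses determines which terms appear positive or negative.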

ELBO function

Hi,

The elbo function returns monitor_functions, a dictionary of elements to monitor.

In train.py and evaluation.py, you call elbo_loss, which returns monitor_functions. So far so good. In train.py, you call optimizer.minimize(loss_op), where loss_op is the return value of the elbo function
(line 259 in train.py). minimize() should take the function to be minimized as an argument.

Perhaps there is a better explanation for how the code is written, since it seems unlikely you could have gotten the code to work if this were an error.

I just realized that the code calls train() and not train_simple(). The issue I mention above is in train_simple(). I assume it is an error?

Thank you.
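
For context, in TF1-style code optimizer.minimize() expects a scalar loss tensor rather than a dictionary of monitor functions. A minimal sketch, with illustrative names that are not the repo's:

    # Minimal TF1-style sketch: minimize() takes a scalar loss Tensor.
    import tensorflow as tf

    x = tf.placeholder(tf.float32, [None, 784])
    recon = tf.layers.dense(x, 784)
    loss_op = tf.reduce_mean(tf.square(recon - x))               # scalar Tensor
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss_op)    # pass the Tensor itself

    # If a loss function returned both a loss Tensor and a dict of monitor
    # ops, only the loss Tensor should be handed to minimize(); the dict
    # would be for logging only.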

IAF is now working... I think

I can train IAF and it gets c. 120 loss. The images are a little blurry but are definitely numbers.

I went back to your method of using an autoencoder for each set of flow parameters.

The structure is like this. Suppose we are using one step of IAF, so we need to generate a single mu and a single sigma that are autoregressive.

We have a neural net F operating on x to produce an intermediate hidden representation h (this is not z, NB):

h = F(x)

Then we apply another fully connected layer G to get an initial mu_0. (Sigma is the same, so I'll leave it out.)

mu_0 = G(h) = G(F(x)) = mu_0(x)

Now MADE kicks in. We use the single-layer autoencoder of MADE to transform mu_0 --> Enc(mu_0) --> mu = Dec(Enc(mu_0)). This last value is the actual flow parameter mu.

Since we used MADE to transform mu_0 to mu, we will be guaranteed that mu itself is autoregressive. Do the same for sigma.

If we need K steps of IAF, then we transform the same mu_0 using K single-layer MADEs to mu_1, ..., mu_K. Note that these mu are all decoupled from one another, and depend on each other only through their shared initial representation mu_0(x).
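
A rough numpy sketch of that structure for a single IAF step; the dimensions, the specific masks, and the choice of numpy over the repo's TensorFlow code are assumptions for illustration:

    # h = F(x), mu_0 = G(h), then a single-layer MADE-style autoencoder that
    # makes mu autoregressive in mu_0 (and hence a valid IAF parameter).
    import numpy as np

    rng = np.random.default_rng(0)
    D_x, D_h, D_z = 784, 256, 32

    def relu(a):
        return np.maximum(a, 0.0)

    W_F = rng.normal(scale=0.01, size=(D_x, D_h))   # F: x -> h
    W_G = rng.normal(scale=0.01, size=(D_h, D_z))   # G: h -> mu_0

    # MADE masks: hidden unit k may see inputs j <= k, output i may see
    # hidden units k < i, so output i depends only on inputs j < i.
    mask_enc = np.triu(np.ones((D_z, D_z)))
    mask_dec = np.triu(np.ones((D_z, D_z)), k=1)
    W_enc = rng.normal(scale=0.01, size=(D_z, D_z)) * mask_enc
    W_dec = rng.normal(scale=0.01, size=(D_z, D_z)) * mask_dec

    x = rng.normal(size=(8, D_x))        # a batch of inputs
    h = relu(x @ W_F)                    # h = F(x)
    mu_0 = h @ W_G                       # mu_0 = G(F(x))
    mu = relu(mu_0 @ W_enc) @ W_dec      # mu = Dec(Enc(mu_0)), autoregressive

    # For K IAF steps, reuse the same mu_0 with K independent masked
    # (W_enc, W_dec) pairs to get mu_1, ..., mu_K; sigma is handled the same way.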

Adding convnets and getting onto the GPU

We should really be using convnets...
My computer can't really handle them though. I have a LeNet-style encoder training (and doing a good job!), but it runs about 20x slower than the single-hidden-layer encoder.
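
For reference, a LeNet-style encoder in this repo's (x, e) -> z convention might look roughly like the sketch below; the use of TF1-style tf.layers, the filter counts, and the layer sizes are guesses, not the actual code:

    # Rough sketch of a LeNet-style convolutional encoder, (x, e) -> z.
    import tensorflow as tf

    def _lenet_encoder(x, e, z_dim):
        img = tf.reshape(x, [-1, 28, 28, 1])                      # MNIST-shaped input assumed
        h = tf.layers.conv2d(img, 20, 5, activation=tf.nn.relu)   # conv block 1
        h = tf.layers.max_pooling2d(h, 2, 2)
        h = tf.layers.conv2d(h, 50, 5, activation=tf.nn.relu)     # conv block 2
        h = tf.layers.max_pooling2d(h, 2, 2)
        h = tf.layers.dense(tf.layers.flatten(h), 500, activation=tf.nn.relu)
        mu = tf.layers.dense(h, z_dim)                            # Gaussian parameters
        log_sigma = tf.layers.dense(h, z_dim)
        return mu + tf.exp(log_sigma) * e                         # reparameterised z

    def lenet_encoder(z_dim):
        return lambda x, e: _lenet_encoder(x, e, z_dim)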

UnlabelledDataSet: self.images

Hi,

Yet, in train(), you call:

      feed_dict[x], feed_dict[x_w] = training_data.next_batch(batch_size, whitened=False)

(my line numbers no longer correspond to yours). Note that in UnlabelledDataSet, the last few lines are:

    if whitened:
        return self.images[start:end], self._whitened_images[start:end]
    else: return self._images[start:end], self.images[start:end]

whitened is False, and self.images is not defined. Why does this work? Thanks.
