y0ast / vae-torch
Implementation of Variational Auto-Encoder in Torch7
License: MIT License
In GaussianCriterion.lua the gradient with respect to log(sigma^2) is computed as:
self.gradInput[2] = torch.exp(-input[2]):cmul(torch.add(target,-1,input[1]):pow(2)):mul(-1):add(0.5)
But it seems to me that the multiplication should be by -0.5, not -1, such that
self.gradInput[2] = torch.exp(-input[2]):cmul(torch.add(target,-1,input[1]):pow(2)):mul(-0.5):add(0.5)
which would also be consistent with the expression in your comment (after flipping the sign for neg. log likelihood): - 0.5 + 0.5 * (x - mu)^2 / sigma^2
I am seeing a gradient check fail with the original code but it passes after the change.
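The proposed -0.5 factor can be checked numerically. A minimal finite-difference sketch (in NumPy rather than Torch, purely as an illustration of the math) with arbitrary example values:

```python
import numpy as np

# Per-dimension Gaussian negative log-likelihood, parameterized by
# mu and logvar = log(sigma^2), as in GaussianCriterion.lua.
def nll(x, mu, logvar):
    return 0.5 * np.log(2 * np.pi) + 0.5 * logvar \
        + 0.5 * (x - mu) ** 2 * np.exp(-logvar)

x, mu, logvar = 1.3, 0.4, -0.2

# Analytic gradient with the proposed -0.5 factor:
# d nll / d logvar = 0.5 - 0.5 * (x - mu)^2 * exp(-logvar)
grad_analytic = 0.5 - 0.5 * (x - mu) ** 2 * np.exp(-logvar)

# Central finite-difference estimate for comparison
h = 1e-6
grad_numeric = (nll(x, mu, logvar + h) - nll(x, mu, logvar - h)) / (2 * h)

print(abs(grad_analytic - grad_numeric) < 1e-6)  # gradients agree
```

With the original -1 factor the analytic and numeric gradients disagree by exactly 0.5 * (x - mu)^2 * exp(-logvar), consistent with the failing gradient check.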
I have a question I haven't been able to clear up anywhere else.
The only inputs to the KLD criterion are the mean and log-variance vectors output by the encoder. These are just the outputs of two linear layers, and as far as I can tell they are the mean and variance of the prior p(Z).
Isn't the KL divergence term supposed to measure the divergence between p(Z) and q(Z|X)? How is this estimated by only having the definition of the prior, i.e. the mean and variance of p(Z)?
The lines of code that I'm talking about are:
From main.lua:
local mean, log_var = encoder(input):split(2)
local z = nn.Sampler()({mean, log_var})
...
local KLDerr = KLD:forward(mean, log_var)
Above, KLD only looks at mean and log_var. These are both summary statistics of the prior, and both are outputs of the linear layers that produce the encoder's output. What is the interpretation under which KLD also considers the posterior q(Z|X)?
envy@ub1404:/os_pri/github/VAE-Torch$ th main.lua
[======================================== 499/499 ====================================>] Tot: 1m8s | Step: 135ms
Epoch: 1 Lowerbound: 165.41410384644 time: 68.586935043335
[======================================== 499/499 ====================================>] Tot: 1m13s | Step: 174ms
Epoch: 2 Lowerbound: 122.62024767955 time: 73.759642124176
/home/envy/torch/install/bin/luajit: cannot open <save/parameters.t7> in mode w at /tmp/luarocks_torch-scm-1-8096/torch7/lib/TH/THDiskFile.c:640
stack traceback:
[C]: at 0x7f1619f91110
[C]: in function 'DiskFile'
/home/envy/torch/install/share/lua/5.1/torch/File.lua:385: in function 'save'
main.lua:130: in main chunk
[C]: in function 'dofile'
...envy/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406260
envy@ub1404:
Hi,
I am trying to follow Kingma & Welling's paper at http://arxiv.org/pdf/1312.6114v10.pdf along with your Torch code. I can follow everything in the code except the use of exp in the reparameterization updateOutput and updateGradInput functions. I did not find any mention of an exponential in the paper or in other implementations of VAE (in Theano/PyLearn2). Also, is the fill of 0.5 just an arbitrary parameter value, or is there a specific reason for it?
if torch.typename(input[1]) == 'torch.CudaTensor' then
   self.eps = self.eps:cuda()
   -- tensor of 0.5s, used to halve log(sigma^2) before exponentiating
   self.output = torch.CudaTensor():resizeAs(input[2]):fill(0.5)
else
   self.output = torch.Tensor():resizeAs(input[2]):fill(0.5)
end
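For anyone else puzzled by this: the encoder outputs log(sigma^2) rather than sigma (that keeps the linear layer unconstrained and numerically stable), so the sampler must exponentiate to recover sigma, and exp(0.5 * log sigma^2) = sigma is exactly where both the exp and the 0.5 come from. A minimal NumPy sketch of the reparameterization step (illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Encoder outputs: mean and logvar = log(sigma^2) per latent dimension.
mu = np.array([0.0, 1.0])
logvar = np.array([0.0, -2.0])

# Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
# sigma = sqrt(exp(logvar)) = exp(0.5 * logvar) -- the fill of 0.5 is the
# constant that halves log(sigma^2) inside the exp, not a tunable value.
eps = rng.standard_normal(mu.shape)
sigma = np.exp(0.5 * logvar)
z = mu + sigma * eps

print(np.allclose(sigma, np.sqrt(np.exp(logvar))))  # True
```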
BTW, I have reviewed 5 different implementations of Variational Auto-Encoders, and your Torch implementation is the most concise and clear.
Thank you
Thanks for this project, it clearly shows how to implement VAE in torch.
I ran main.lua to train a model, then I tried python plot.py, but I got the following error:
IOError: Unable to open file (Unable to open file: name = 'params/ff_epoch_740.hdf5', errno = 2, error message = 'no such file or directory', flags = 0, o_flags = 0)
Is plot.py the file that generates the images? If so, how can I run it? Thanks!
After setting criterion.sizeAverage = true, I noticed that the KLD criterion constantly gives an output of ~0 after epoch 2-3, and the reconstructions are all identical and do not make sense at all. I tried very small learning rates as well as bigger ones, and I still face the same issue. Why is that?
Hey, I downloaded your encoder to use.
When trying to run it, the load.lua function tries to require pl.data, which you don't explicitly list as a dependency. I traced the problem to the function 'loadBinarizedMNIS' (lines 40-55).
I think you got confused between dataset and data when defining new variables. Maybe you can check if that's okay?
Sorry in advance if this is confusing.
The gradient with respect to log(sigma^2) is currently computed as:
self.gradInput[2] = torch.exp(-input[2]):cmul(torch.add(target,-1,input[1]):pow(2)):add(-0.5)
It seems to me that the gradient update step should be:
self.gradInput[2] = torch.exp(-input[2]):cmul(torch.add(target,-1,input[1]):pow(2)):mul(0.5):add(-0.5)
Hi, maybe the loss should be divided by the batch size. Learning won't be affected, but since all the other criterions do this, it might be worth doing.
https://github.com/y0ast/VAE-Torch/blob/master/KLDCriterion.lua#L14
self.output = -0.5 * torch.sum(KLDelements) / input:size(1)
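For illustration, a small NumPy sketch of the suggested change (a sum over all elements, then an average over the batch dimension, mirroring `/ input:size(1)`; the values are made up):

```python
import numpy as np

# Minibatch of posterior parameters: batch of 4, latent dim 3
# (illustrative values, not from the repo).
mu = np.zeros((4, 3))
logvar = np.zeros((4, 3))
mu[0, 0] = 2.0  # one off-center example; all others match the prior exactly

# Per-element KL terms, summed, then averaged over the batch:
kl_elements = 1 + logvar - mu ** 2 - np.exp(logvar)
kl_sum = -0.5 * np.sum(kl_elements)          # 2.0
kl_batch_avg = kl_sum / mu.shape[0]          # 0.5

print(kl_sum, kl_batch_avg)
```

Averaging keeps the loss scale independent of the batch size, which is what sizeAverage does in the standard nn criterions.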
Hi,
Sorry if the question sounds too naive, but I am getting a positive lowerbound, which I believe shouldn't be the case? I am using my own data (ECG, 1200 dimensions). I would be grateful if you could point out possible causes. Thanks
Not done yet after 10 days on an Intel i5820.
In Algorithm 2 and Eqn. 24 of the appendix of their paper, Kingma & Welling give the details of the estimator for the full VAE.
Could I request that this be implemented, to enhance the performance of the package? I'm not aware of any Torch packages that have done this yet.
Since the Adam optimizer is now available in the optim package, it should be possible to obtain the published negative log-likelihood of approximately 87 from the paper of Rezende, Mohamed & Wierstra.
Thanks for the code. I am just confused about the variable batchlowerbound in main.lua.
Why is batchlowerbound = reconstruc_err + KLDerr, instead of -KLDerr + E_q[log p(x|z)]?
Is this the error induced when maximising the lower bound?
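The two expressions differ only by sign convention: if the reconstruction error is the negative log-likelihood, then reconstruc_err + KLDerr is exactly the negative of the lower bound, i.e. a loss to minimize rather than an objective to maximize. A tiny sketch with made-up numbers:

```python
# Sign-convention sketch (illustrative numbers only).
# ELBO = E_q[log p(x|z)] - KL(q || p)
log_px_given_z = -120.0   # expected reconstruction log-likelihood
kl = 15.0                 # KL(q(z|x) || p(z)) is always >= 0

lowerbound = log_px_given_z - kl      # the quantity being maximized

# Tracking the *negative* lower bound as a loss:
# reconstruction error = negative log-likelihood.
reconstruction_err = -log_px_given_z  # 120.0
loss = reconstruction_err + kl        # 135.0

print(loss == -lowerbound)  # True
```

So minimizing reconstruction error plus KL is the same as maximizing the lower bound; only the sign flips.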
I'm trying to use the implementation of the Adam optimizer available in the optim package with the line
-- not used -- x, batchlowerbound = optim.adagrad(opfunc, parameters, config, state)
x, batchlowerbound = optim.adam(opfunc, parameters, config, state)
It does not seem to converge, or even change much, with any of the configurations I've tried.
Could you suggest a config which works, please?
Thank you.
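In case it helps anyone landing here: a typical optim.adam config table looks like the sketch below. The hyperparameter values are only the usual Adam starting points, not settings known to reproduce the paper's numbers, so treat them as a guess to tune from.

```lua
-- assumed starting-point settings for optim.adam; tune for your model
local config = {
   learningRate = 1e-3,
   beta1 = 0.9,
   beta2 = 0.999,
   epsilon = 1e-8,
}
local state = {}
x, batchlowerbound = optim.adam(opfunc, parameters, config, state)
```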
Thank you.
Hi Joost,
I noticed you recently changed the MNIST dataset to a new version. What are the differences?
Thanks,
Yunqing