nicola-decao / bnaf
PyTorch implementation of Block Neural Autoregressive Flow
Home Page: http://arxiv.org/abs/1904.04676
License: MIT License
The forward pass of the BNAF class should be changed to the following (or similar) to properly handle the univariate case; otherwise, calling the model reduces the gradients (`grad`) to a single number when it should have a first dimension of size `batch_size`. I confirmed that the following works:
```
grad = grad.squeeze()
# Multivariate case: grad still has a feature dimension to sum over.
reduce_sum = len(grad.shape) > 1
if reduce_sum:
    if self.res == 'normal':
        return inputs + outputs, torch.nn.functional.softplus(grad).sum(-1)
    elif self.res == 'gated':
        return self.gate.sigmoid() * outputs + (1 - self.gate.sigmoid()) * inputs, \
            (torch.nn.functional.softplus(grad + self.gate) -
             torch.nn.functional.softplus(self.gate)).sum(-1)
    else:
        return outputs, grad.sum(-1)
else:
    # Univariate case: grad is already (batch_size,); summing over the
    # last dim would collapse it to a scalar, so skip the reduction.
    if self.res == 'normal':
        return inputs + outputs, torch.nn.functional.softplus(grad)
    elif self.res == 'gated':
        return self.gate.sigmoid() * outputs + (1 - self.gate.sigmoid()) * inputs, \
            (torch.nn.functional.softplus(grad + self.gate) -
             torch.nn.functional.softplus(self.gate))
    else:
        return outputs, grad
```
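To see why the guard is needed, here is a minimal sketch; the `(batch_size, 1, 1, 1)` shape is an illustrative assumption for what reaches this point in the univariate case:

```python
import torch

batch_size = 8
# Univariate case: the per-sample gradient carries trailing
# singleton dims, e.g. (batch_size, 1, 1, 1).
grad = torch.randn(batch_size, 1, 1, 1)

g = grad.squeeze()
print(g.shape)          # torch.Size([8]) -- one value per sample

# Without the len(g.shape) > 1 guard, .sum(-1) would collapse the
# batch dimension to a single scalar:
print(g.sum(-1).shape)  # torch.Size([])
```

So in the univariate case `.squeeze()` already yields a 1-D tensor of per-sample values, and the extra `.sum(-1)` is what destroys the batch dimension.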
In the BNAF `MaskedWeight.get_weights` function, there is a line as follows:

w = torch.exp(self._weight) * self.mask_d + self._weight * self.mask_o

I believe `torch.exp(self._weight) * self.mask_d` should be `torch.exp(self._diag_weight) * self.mask_d`, right?
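For context, a minimal sketch of the suggested construction (the 3×3 single-block shapes are illustrative assumptions; in BNAF the masks select block-diagonal vs. strictly lower off-diagonal entries):

```python
import torch

# Illustrative single-block case with a 3x3 weight matrix.
_weight = torch.randn(3, 3)
_diag_weight = torch.randn(3, 3)  # the separate diagonal parameter
mask_d = torch.eye(3)                               # diagonal entries
mask_o = torch.tril(torch.ones(3, 3), diagonal=-1)  # strictly lower entries

# Suggested fix: exponentiate the *diagonal* parameter so the diagonal
# entries stay strictly positive (required for an invertible flow):
w = torch.exp(_diag_weight) * mask_d + _weight * mask_o
assert (torch.diagonal(w) > 0).all()
```

The point of exponentiating a dedicated `_diag_weight` rather than `_weight` is that the diagonal positivity constraint then has its own unconstrained parameter, independent of the off-diagonal entries.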
Hi @nicola-decao, very nice work! I am thinking of using BNAF to do variational inference where the posterior is over a space of a few thousand to tens of thousands of dimensions. I wonder if the current implementation can scale up to that many dimensions; my concern is that the model might not fit into GPU memory. Can you provide an estimate of the space complexity of a given architecture consisting of, say, n stacked flows of m hidden layers each? I know you gave an estimate of the number of parameters in Table 2 of the paper, but how does that translate into memory requirements? I would appreciate your insight here because I am more of a TensorFlow person, so trying this out in PyTorch will likely take me a while. Thanks in advance!