nicola-decao / bnaf
PyTorch implementation of Block Neural Autoregressive Flow
Home Page: http://arxiv.org/abs/1904.04676
License: MIT License
The forward pass of the BNAF class should be changed to the following (or similar) to properly handle the univariate case; otherwise, calling the model reduces the gradients (`grad`) to a single number when it should have a first dimension of size `batch_size`. I confirmed that the following works:
```
grad = grad.squeeze()
# Multivariate case: grad still has a feature dimension to sum over.
reduce_sum = len(grad.shape) > 1
if reduce_sum:
    if self.res == 'normal':
        return inputs + outputs, torch.nn.functional.softplus(grad).sum(-1)
    elif self.res == 'gated':
        return self.gate.sigmoid() * outputs + (1 - self.gate.sigmoid()) * inputs, \
            (torch.nn.functional.softplus(grad + self.gate) -
             torch.nn.functional.softplus(self.gate)).sum(-1)
    else:
        return outputs, grad.sum(-1)
else:
    # Univariate case: grad is already (batch_size,); summing over the
    # last dim would collapse it to a scalar, so skip the reduction.
    if self.res == 'normal':
        return inputs + outputs, torch.nn.functional.softplus(grad)
    elif self.res == 'gated':
        return self.gate.sigmoid() * outputs + (1 - self.gate.sigmoid()) * inputs, \
            (torch.nn.functional.softplus(grad + self.gate) -
             torch.nn.functional.softplus(self.gate))
    else:
        return outputs, grad
```
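To see why the guard is needed, here is a minimal sketch; the `(batch_size, 1, 1, 1)` shape is an illustrative assumption for what reaches this point in the univariate case:

```python
import torch

batch_size = 8
# Univariate case: the per-sample gradient carries trailing
# singleton dims, e.g. (batch_size, 1, 1, 1).
grad = torch.randn(batch_size, 1, 1, 1)

g = grad.squeeze()
print(g.shape)          # torch.Size([8]) -- one value per sample

# Without the len(g.shape) > 1 guard, .sum(-1) would collapse the
# batch dimension to a single scalar:
print(g.sum(-1).shape)  # torch.Size([])
```

So in the univariate case `.squeeze()` already yields a 1-D tensor of per-sample values, and the extra `.sum(-1)` is what destroys the batch dimension.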
In the BNAF `MaskedWeight.get_weights` function, there is a line as follows:

w = torch.exp(self._weight) * self.mask_d + self._weight * self.mask_o

I believe `torch.exp(self._weight) * self.mask_d` should be `torch.exp(self._diag_weight) * self.mask_d`, right?
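For context, a minimal sketch of the suggested construction (the 3×3 single-block shapes are illustrative assumptions; in BNAF the masks select block-diagonal vs. strictly lower off-diagonal entries):

```python
import torch

# Illustrative single-block case with a 3x3 weight matrix.
_weight = torch.randn(3, 3)
_diag_weight = torch.randn(3, 3)  # the separate diagonal parameter
mask_d = torch.eye(3)                               # diagonal entries
mask_o = torch.tril(torch.ones(3, 3), diagonal=-1)  # strictly lower entries

# Suggested fix: exponentiate the *diagonal* parameter so the diagonal
# entries stay strictly positive (required for an invertible flow):
w = torch.exp(_diag_weight) * mask_d + _weight * mask_o
assert (torch.diagonal(w) > 0).all()
```

The point of exponentiating a dedicated `_diag_weight` rather than `_weight` is that the diagonal positivity constraint then has its own unconstrained parameter, independent of the off-diagonal entries.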
Hi @nicola-decao, very nice work! I am thinking of using BNAF to do variational inference where the posterior is over a space of a few thousand to tens of thousands of dimensions. I wonder if the current implementation can scale up to that many dimensions; my concern is that the model might not fit into GPU memory. Can you provide an estimate of the space complexity of a given architecture consisting of, say, n stacked flows of m hidden layers each? I know you gave an estimate of the number of parameters in Table 2 of the paper, but how does that translate into memory requirements? I would appreciate your insight here because I am more of a TensorFlow person, so trying this out in PyTorch will likely take me a while. Thanks in advance!