Introduction to Restricted Boltzmann Machines using Tensorflow

Material

The notebook tutorials reconstruct some digits and some dummy datasets using unsupervised learning. The next phase is to extend this to Deep Boltzmann Machines and Deep Belief Networks. There are two other implementations of RBMs that I wrote, one in pure Python using numpy and one in Matlab. I prefer Matlab.

TODO:

  • Ideally, this should be seamlessly integrated with Tensorflow's optimization libraries, especially for the DBM, where variational inference techniques are already implemented in Tensorflow.
  • Add Annealed Importance Sampling to evaluate the model
  • Correctly print the weight images in Tensorboard after training
  • Better explain the Gaussian approach and its relation to Gaussian Mixture Models

There is also an accompanying presentation I gave for my group at ENS. The second part of the presentation has some derivations for classic RBMs and the Contrastive Divergence algorithm.

Installation notes

I suggest creating a dedicated environment for any Tensorflow-related work using Anaconda and installing the dependencies (Python, Jupyter, numpy, Matplotlib, Tensorflow) inside it:

conda create --name tensorflow-env python=3.5
source activate tensorflow-env
pip install tensorflow==1.0.0 numpy==1.12.0 matplotlib==2.0.0 jupyter
jupyter notebook

To use Tensorboard for visualization:

tensorboard --logdir="/path/to/logs"

Dependencies:

  • Tensorflow 1.0.0
  • Python 3.5.2
  • Matplotlib 2.0.0
  • Numpy 1.12.0
  • Jupyter Notebook

Outline

Restricted Boltzmann Machines are a class of undirected probabilistic graphical models of joint probability distributions (Markov Random Fields), where the nodes in the graph are random variables. Markov Random Fields are well known and extensively studied in the physics literature, with the ferromagnetic Ising spin model from statistical mechanics being the best-known example. Atoms are fixed on a 2-D (or 1-D) lattice and neighbouring spins interact with each other. Each configuration of +/-1 spins has an associated energy, and we are interested in the possible states the system can take. It turns out that the joint probability of such a system is given by the Boltzmann (Gibbs) distribution.
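For reference, the textbook form looks like this (the coupling J, temperature T and partition function Z are the standard symbols in units where k_B = 1, not quantities defined in this repository):

E(\mathbf{s}) = -J \sum_{\langle i,j \rangle} s_i s_j, \qquad p(\mathbf{s}) = \frac{1}{Z} e^{-E(\mathbf{s})/T}, \qquad Z = \sum_{\mathbf{s}} e^{-E(\mathbf{s})/T}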

Similarly, the joint probability distribution of a Restricted Boltzmann Machine is modelled by the Gibbs distribution. Furthermore, an RBM can be viewed as a stochastic neural network with a set of visible nodes that take data as input and a set of hidden nodes that encode a lower-dimensional representation of that data. Because the input can be thought of as samples from a high-dimensional probability distribution, the goal is to learn the joint probability of the visible-hidden ensemble.
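Concretely, for binary units the "stochastic neural network" view means that each unit switches on with a sigmoid probability of its input. These are the standard RBM conditionals, written with the usual symbols W, b, c for the weights and biases (notation assumed here for illustration, not taken from the code):

p(h_j = 1 \mid v) = \sigma\Big(c_j + \sum_i W_{ij} v_i\Big), \qquad p(v_i = 1 \mid h) = \sigma\Big(b_i + \sum_j W_{ij} h_j\Big), \qquad \sigma(x) = \frac{1}{1 + e^{-x}}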

RBM Model

The model is defined in the rbm folder, together with methods for computing the probabilities and free energy of the system as well as for sampling from it. The goal is to learn the joint probability distribution that maximizes the probability of the data, also known as the likelihood; the energy and likelihood formulas are given below.
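For reference, the standard binary RBM energy, likelihood and free energy in the usual notation (visible vector v, hidden vector h, weights W, biases b and c; a textbook sketch rather than a transcription of the repository's own figures):

E(v, h) = -b^\top v - c^\top h - v^\top W h

p(v) = \frac{1}{Z} \sum_h e^{-E(v, h)}, \qquad Z = \sum_{v, h} e^{-E(v, h)}

F(v) = -b^\top v - \sum_j \log\big(1 + e^{\,c_j + (W^\top v)_j}\big), \qquad p(v) = \frac{e^{-F(v)}}{Z}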

Binary / Gaussian RBM on the BAS (bars-and-stripes) dataset

The BAS dataset is a toy dataset of n-by-n binary images in which either every row is uniformly 0 or 1 (stripes) or every column is (bars). For the 4-by-4 case this gives 2 x 2^4 = 32 configurations (30 distinct images, since the all-zeros and all-ones patterns occur in both families). We use a binary and a Gaussian RBM (hidden units are Gaussian rather than binary) with 16 hidden units to reconstruct the input, as well as partial input, for the 4-by-4 case; a sketch of how such a dataset can be enumerated is shown below.
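As an illustration of the dataset itself, here is a minimal NumPy sketch that enumerates the patterns (generate_bas is a hypothetical helper for illustration, not the notebook's actual code):

import numpy as np

def generate_bas(n=4):
    """Enumerate all n-by-n bars-and-stripes images as flattened binary vectors."""
    seen, patterns = set(), []
    for bits in range(2 ** n):
        b = np.array([(bits >> i) & 1 for i in range(n)], dtype=np.float32)
        stripes = np.tile(b[:, None], (1, n))   # every row constant -> horizontal stripes
        bars = np.tile(b[None, :], (n, 1))      # every column constant -> vertical bars
        for img in (stripes, bars):
            key = tuple(img.ravel().astype(int))
            if key not in seen:                  # all-zeros / all-ones occur in both families
                seen.add(key)
                patterns.append(img.ravel())
    return np.stack(patterns)

bas = generate_bas(4)
print(bas.shape)  # (30, 16)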

Training the binary RBM for 3000 epochs, it reconstructs partial input with 70% accuracy. The Gaussian RBM does slightly better, reaching 86% accuracy after 1000 epochs.

MNIST reconstruction

The other two notebooks show how to use the RBM to learn a lower-dimensional representation of the MNIST dataset. You can see the reconstructions in both cases; they are slightly better in the Gaussian scenario.

Bear in mind that this is the simplest form of RBM training, using Contrastive Divergence 1 (only one step of MCMC simulation) without weight cost or temperature [Tieleman 08]. There are, of course, better-performing variants of the model.
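To make the "one step of MCMC simulation" concrete, here is a minimal NumPy sketch of a single CD-1 parameter update for a binary RBM (a generic textbook version with assumed names W, b, c and lr, not the implementation in the rbm folder):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.1, rng=np.random):
    """One Contrastive Divergence-1 update for a binary RBM (generic sketch).

    v0: batch of visible vectors, shape (batch, n_visible)
    W:  weights (n_visible, n_hidden); b, c: visible / hidden biases.
    Parameters are updated in place.
    """
    # Positive phase: hidden probabilities and samples given the data
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.uniform(size=ph0.shape) < ph0).astype(v0.dtype)

    # Negative phase: one Gibbs step (reconstruct visibles, then hiddens)
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.uniform(size=pv1.shape) < pv1).astype(v0.dtype)
    ph1 = sigmoid(v1 @ W + c)

    # Gradient approximation: data statistics minus reconstruction statistics
    batch = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / batch
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return v1  # reconstruction, handy for monitoring error

Variants such as persistent contrastive divergence [Tieleman 08] change only the negative phase, keeping a persistent Gibbs chain instead of restarting it from the data.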
