
DeepLearnToolbox's Introduction

Deprecation notice.

This toolbox is outdated and no longer maintained.

There are much better tools available for deep learning than this toolbox, e.g. Theano, Torch, or TensorFlow.

I would suggest you use one of the tools mentioned above rather than use this toolbox.

Best, Rasmus.

DeepLearnToolbox

A Matlab toolbox for Deep Learning.

Deep Learning is a new subfield of machine learning that focuses on learning deep hierarchical models of data. It is inspired by the human brain's apparent deep (layered, hierarchical) architecture. A good overview of the theory of Deep Learning is Learning Deep Architectures for AI.

For a more informal introduction, see the following videos by Geoffrey Hinton and Andrew Ng.

If you use this toolbox in your research, please cite Prediction as a candidate for learning deep hierarchical models of data:

@MASTERSTHESIS{IMM2012-06284,
    author       = "R. B. Palm",
    title        = "Prediction as a candidate for learning deep hierarchical models of data",
    year         = "2012",
}

Contact: rasmusbergpalm at gmail dot com

Directories included in the toolbox

NN/ - A library for Feedforward Backpropagation Neural Networks

CNN/ - A library for Convolutional Neural Networks

DBN/ - A library for Deep Belief Networks

SAE/ - A library for Stacked Auto-Encoders

CAE/ - A library for Convolutional Auto-Encoders

util/ - Utility functions used by the libraries

data/ - Data used by the examples

tests/ - Unit tests to verify the toolbox is working

For references on each library, check REFS.md.

Setup

  1. Download.
  2. addpath(genpath('DeepLearnToolbox'));
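
To check that the setup worked, you can run one of the bundled example scripts from tests/ (they are reproduced in full below); this assumes the data/ directory is on the path via the genpath call above:

test_example_NN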

Example: Deep Belief Network

function test_example_DBN
load mnist_uint8;

train_x = double(train_x) / 255;
test_x  = double(test_x)  / 255;
train_y = double(train_y);
test_y  = double(test_y);

%%  ex1 train a 100 hidden unit RBM and visualize its weights
rand('state',0)
dbn.sizes = [100];
opts.numepochs =   1;
opts.batchsize = 100;
opts.momentum  =   0;
opts.alpha     =   1;
dbn = dbnsetup(dbn, train_x, opts);
dbn = dbntrain(dbn, train_x, opts);
figure; visualize(dbn.rbm{1}.W');   %  Visualize the RBM weights

%%  ex2 train a 100-100 hidden unit DBN and use its weights to initialize a NN
rand('state',0)
%train dbn
dbn.sizes = [100 100];
opts.numepochs =   1;
opts.batchsize = 100;
opts.momentum  =   0;
opts.alpha     =   1;
dbn = dbnsetup(dbn, train_x, opts);
dbn = dbntrain(dbn, train_x, opts);

%unfold dbn to nn
nn = dbnunfoldtonn(dbn, 10);
nn.activation_function = 'sigm';

%train nn
opts.numepochs =  1;
opts.batchsize = 100;
nn = nntrain(nn, train_x, train_y, opts);
[er, bad] = nntest(nn, test_x, test_y);

assert(er < 0.10, 'Too big error');

Example: Stacked Auto-Encoders

function test_example_SAE
load mnist_uint8;

train_x = double(train_x)/255;
test_x  = double(test_x)/255;
train_y = double(train_y);
test_y  = double(test_y);

%%  ex1 train a 100 hidden unit SDAE and use it to initialize a FFNN
%  Setup and train a stacked denoising autoencoder (SDAE)
rand('state',0)
sae = saesetup([784 100]);
sae.ae{1}.activation_function       = 'sigm';
sae.ae{1}.learningRate              = 1;
sae.ae{1}.inputZeroMaskedFraction   = 0.5;
opts.numepochs =   1;
opts.batchsize = 100;
sae = saetrain(sae, train_x, opts);
visualize(sae.ae{1}.W{1}(:,2:end)')

% Use the SDAE to initialize a FFNN
nn = nnsetup([784 100 10]);
nn.activation_function              = 'sigm';
nn.learningRate                     = 1;
nn.W{1} = sae.ae{1}.W{1};

% Train the FFNN
opts.numepochs =   1;
opts.batchsize = 100;
nn = nntrain(nn, train_x, train_y, opts);
[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.16, 'Too big error');

Example: Convolutional Neural Nets

function test_example_CNN
load mnist_uint8;

train_x = double(reshape(train_x',28,28,60000))/255;
test_x = double(reshape(test_x',28,28,10000))/255;
train_y = double(train_y');
test_y = double(test_y');

%% ex1 Train a 6c-2s-12c-2s Convolutional neural network 
%will run 1 epoch in about 200 seconds and get around 11% error. 
%With 100 epochs you'll get around 1.2% error
rand('state',0)
cnn.layers = {
    struct('type', 'i') %input layer
    struct('type', 'c', 'outputmaps', 6, 'kernelsize', 5) %convolution layer
    struct('type', 's', 'scale', 2) %sub sampling layer
    struct('type', 'c', 'outputmaps', 12, 'kernelsize', 5) %convolution layer
    struct('type', 's', 'scale', 2) %subsampling layer
};
cnn = cnnsetup(cnn, train_x, train_y);

opts.alpha = 1;
opts.batchsize = 50;
opts.numepochs = 1;

cnn = cnntrain(cnn, train_x, train_y, opts);

[er, bad] = cnntest(cnn, test_x, test_y);

%plot mean squared error
figure; plot(cnn.rL);

assert(er<0.12, 'Too big error');

Example: Neural Networks

function test_example_NN
load mnist_uint8;

train_x = double(train_x) / 255;
test_x  = double(test_x)  / 255;
train_y = double(train_y);
test_y  = double(test_y);

% normalize
[train_x, mu, sigma] = zscore(train_x);
test_x = normalize(test_x, mu, sigma);

%% ex1 vanilla neural net
rand('state',0)
nn = nnsetup([784 100 10]);
opts.numepochs =  1;   %  Number of full sweeps through data
opts.batchsize = 100;  %  Take a mean gradient step over this many samples
[nn, L] = nntrain(nn, train_x, train_y, opts);

[er, bad] = nntest(nn, test_x, test_y);

assert(er < 0.08, 'Too big error');

%% ex2 neural net with L2 weight decay
rand('state',0)
nn = nnsetup([784 100 10]);

nn.weightPenaltyL2 = 1e-4;  %  L2 weight decay
opts.numepochs =  1;        %  Number of full sweeps through data
opts.batchsize = 100;       %  Take a mean gradient step over this many samples

nn = nntrain(nn, train_x, train_y, opts);

[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.1, 'Too big error');


%% ex3 neural net with dropout
rand('state',0)
nn = nnsetup([784 100 10]);

nn.dropoutFraction = 0.5;   %  Dropout fraction 
opts.numepochs =  1;        %  Number of full sweeps through data
opts.batchsize = 100;       %  Take a mean gradient step over this many samples

nn = nntrain(nn, train_x, train_y, opts);

[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.1, 'Too big error');

%% ex4 neural net with sigmoid activation function
rand('state',0)
nn = nnsetup([784 100 10]);

nn.activation_function = 'sigm';    %  Sigmoid activation function
nn.learningRate = 1;                %  Sigmoid requires a lower learning rate
opts.numepochs =  1;                %  Number of full sweeps through data
opts.batchsize = 100;               %  Take a mean gradient step over this many samples

nn = nntrain(nn, train_x, train_y, opts);

[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.1, 'Too big error');

%% ex5 plotting functionality
rand('state',0)
nn = nnsetup([784 20 10]);
opts.numepochs         = 5;            %  Number of full sweeps through data
nn.output              = 'softmax';    %  use softmax output
opts.batchsize         = 1000;         %  Take a mean gradient step over this many samples
opts.plot              = 1;            %  enable plotting

nn = nntrain(nn, train_x, train_y, opts);

[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.1, 'Too big error');

%% ex6 neural net with sigmoid activation and plotting of validation and training error
% split training data into training and validation data
vx   = train_x(1:10000,:);
tx = train_x(10001:end,:);
vy   = train_y(1:10000,:);
ty = train_y(10001:end,:);

rand('state',0)
nn                      = nnsetup([784 20 10]);     
nn.output               = 'softmax';                   %  use softmax output
opts.numepochs          = 5;                           %  Number of full sweeps through data
opts.batchsize          = 1000;                        %  Take a mean gradient step over this many samples
opts.plot               = 1;                           %  enable plotting
nn = nntrain(nn, tx, ty, opts, vx, vy);                %  nntrain takes validation set as last two arguments (optionally)

[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.1, 'Too big error');

DeepLearnToolbox's Issues

random vs. zero initialization of weights

Perhaps I'm mistaken, but it seems you initialize (the DBN) weights to all zeros before training, rather than doing a random weight initialization. Is there a reason for this? It seems from my reading that random weight initialization would be best.
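
For reference, a minimal sketch of what random initialization of the first RBM's weights could look like, assuming the weights are stored in dbn.rbm{1}.W as in the README example above; the 0.01 scale is a common heuristic, not a value taken from this toolbox:

dbn = dbnsetup(dbn, train_x, opts);
dbn.rbm{1}.W = 0.01 * randn(size(dbn.rbm{1}.W));   % small Gaussian values instead of zeros
dbn = dbntrain(dbn, train_x, opts);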

[Help] Supervised training for time series forecast

Hello, I have some doubts about how to do supervised training for time series forecasting.
I believe the examples presented use images as input, but in my case I have to use arrays of time windows as input and one value as the output of each time window (this value would be the forecasted value). Is it possible to do that, or do the inputs to your functions have to be images? If not, how can I do time series forecasting?
(I'm new to MATLAB and don't understand what the inputs of your functions are.)

Merging CNNs and NNs?

Is there a reason why the CNN code is separated from the "regular" NN code?
I think it is more natural to define a convolutional layer in a neural network rather than using completely different code.

MSE calculation.

Hi Rasmus, I found the MSE is calculated like this:

net.rL(end + 1) = 0.99 * net.rL(end) + 0.01 * net.L;  

I'm confused by the 0.99. Why not use a ratio based on batch_size? Could you explain that? Thanks :)
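
For what it's worth, the 0.99/0.01 pair looks like an exponential moving average of the per-batch loss rather than anything tied to the batch size; here is a sketch with the smoothing factor pulled out as a variable (lambda is a name introduced here, not a field of the toolbox):

lambda = 0.99;                                                  % closer to 1 gives a smoother curve
net.rL(end + 1) = lambda * net.rL(end) + (1 - lambda) * net.L;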

test_cnn_gradients_are_numerically_correct

Hi, since the CNN uses subsampling for pooling, I suppose the gradient won't strictly follow the closed form. Therefore, numerical gradient checking shouldn't work perfectly in this example.
However, it does not generate an error with the subsampling pooling layer.
Then I used max pooling and checked the result: it generates an error in the numerical gradient checking. I cannot figure this out clearly (i.e., subsampling works fine but max pooling does not).
I am hoping to gain some insight from you.
Thank you in advance for your answer.
Regards

caebp.m inconsistent with my deduction

In caebp.m,
I think line 19 z = z + convn(cae.od{i}, flipall(cae.ok{i}{j}), 'full'); is not correct.

By my derivation, it should be convn(cae.od{i}, flipall(flipall(cae.ok{i}{j}))).
The first flipall handles convn, and the second flipall is for the derivative of the activation.

And also line 34
cae.dik{i}{j} = convn(cae.ad{j}, flipall(cae.i{i}), 'valid') / ns;
I think it should be flipall( convn(cae.ad{j}, flipall(cae.i{i}), 'valid') / ns );

Denoising AE

For a DAE, shouldn't noise be added to the input?
But from my understanding of nntrain.m, a portion of the batch_x is forced to zero.
Is this the right approach ?

%Add noise to input (for use in denoising autoencoder)
if(nn.inputZeroMaskedFraction ~= 0)
batch_x = batch_x.*(rand(size(batch_x))>nn.inputZeroMaskedFraction);
end

I am new to GitHub, so forgive any formatting errors.
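
For comparison, the additive-noise variant the question alludes to could look roughly like the sketch below (noiseStd is a hypothetical parameter, not a field of this toolbox); the zero-masking quoted above is what nntrain.m actually does according to the snippet:

noiseStd = 0.1;                                        % hypothetical corruption level
batch_x = batch_x + noiseStd * randn(size(batch_x));   % additive Gaussian noise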

DBN code

When I execute test_example_DBN.m on matlab R2012a, I get this error:

Attempted to access lmisys(5); index out of bounds because numel(lmisys)=4.

Error in lmiunpck (line 23)
rs=lmisys(4); rv=lmisys(5); % row sizes of LMISET,LMIVAR

Error in nnsetup (line 26)
[LMI_set,LMI_var,LMI_term,data]=lmiunpck(lmisys);

Error in dbnunfoldtonn (line 10)
nn = nnsetup(size);

Error in test_example_DBN (line 34)
nn = dbnunfoldtonn(dbn, 10);

Error using the DBN Example

First off, I'm new to MATLAB and I may be doing something wrong. Secondly, I already added the DeepLearnToolbox to the path, but I'm getting an error message when I try to run the DBN example.
The error is:


Undefined function 'dbnsetup' for input arguments of type 'struct'.

Error in DBNexample (line 16)
dbn = dbnsetup(dbn, train_x, opts);


For all the other examples, I get similar errors, but it happens on different functions, of course.

Is it possible that I added the path in the wrong way?
When I run the command 'load mnist_uint8', the data loads correctly, but when I use the functions (running the example), they don't work, returning a message saying that the input arguments are of a different type.
I don't think this issue is the same as others reported here (MATLAB using a function of the same name rather than the toolbox's function), because before I added the toolbox to the path the functions were not found, and after adding the path this error occurs.

I cannot get this toolbox to work in Octave

I hope it is me, but I cannot get any of the examples to work. Could you help me?

What I did:

  • install octave on MacOSX according to the instructions at http://wiki.octave.org/Octave_for_MacOS_X (homebrew)
  • checkout the github code to DeepLearnToolbox
  • go to directory DeepLearnToolbox
  • start octave
  • (in octave) addpath(genpath('DeepLearnToolbox'));
  • (in octave):

function test_example_DBN
load data/mnist_uint8;

train_x = double(train_x) / 255;
test_x = double(test_x) / 255;
train_y = double(train_y);
test_y = double(test_y);

%% ex1 train a 100 hidden unit RBM and visualize its weights
rand('state',0); #rng(0);
dbn.sizes = [100];
opts.numepochs = 1;
opts.batchsize = 100;
opts.momentum = 0;
opts.alpha = 1;
dbn = dbnsetup(dbn, train_x, opts);
dbn = dbntrain(dbn, train_x, opts);
figure; visualize(dbn.rbm{1}.W'); % Visualize the RBM weights

%% ex2 train a 100-100 hidden unit DBN and use its weights to initialize a NN
rng(0);
%train dbn
dbn.sizes = [100 100];
opts.numepochs = 1;
opts.batchsize = 100;
opts.momentum = 0;
opts.alpha = 1;
dbn = dbnsetup(dbn, train_x, opts);
dbn = dbntrain(dbn, train_x, opts);

%unfold dbn to nn
nn = dbnunfoldtonn(dbn, 10);
nn.activation_function = 'sigm';

%train nn
opts.numepochs = 1;
opts.batchsize = 100;
nn = nntrain(nn, train_x, train_y, opts);
[er, bad] = nntest(nn, test_x, test_y);

assert(er < 0.10, 'Too big error');
endfunction

  • run function: test_example_DBN

I get the following error:

octave:6> test_example_DBN
error: 'dbnsetup' undefined near line 16 column 7
error: called from:
error: test_example_DBN at line 16, column 5
octave:6> exit

It seems not all directories are searched for functions. For example, I had to include the "data" directory for the load command to succeed. Does anyone know how to solve this?
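
One possible cause, judging only from the steps listed above (octave is started from inside the DeepLearnToolbox directory, so the relative path 'DeepLearnToolbox' does not resolve from there): add the current directory tree instead, e.g.

addpath(genpath(pwd));   % run from inside the DeepLearnToolbox checkout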

Error in test_example_SAE and saesetup

I added two "addpath" lines,
so that the code is:

function test_example_SAE
addpath('D:/Programs/Deep Learning ToolBox/GitHup Repository To Modify/trunk/data');
load mnist_uint8;

train_x = double(train_x)/255;
test_x = double(test_x)/255;
train_y = double(train_y);
test_y = double(test_y);

%% ex1 train a 100 hidden unit SDAE and use it to initialize a FFNN
% Setup and train a stacked denoising autoencoder (SDAE)
addpath('D:\Programs\Deep Learning ToolBox\GitHup Repository To Modify\trunk\SAE');
rng(0);
sae = saesetup([784 100]);
sae.ae{1}.activation_function = 'sigm';
sae.ae{1}.learningRate = 1;
sae.ae{1}.inputZeroMaskedFraction = 0.5;
opts.numepochs = 1;
opts.batchsize = 100;
sae = saetrain(sae, train_x, opts);
visualize(sae.ae{1}.W{1}(:,2:end)')

% Use the SDAE to initialize a FFNN
nn = nnsetup([784 100 10]);
nn.activation_function = 'sigm';
nn.learningRate = 1;
nn.W{1} = sae.ae{1}.W{1};

% Train the FFNN
opts.numepochs = 1;
opts.batchsize = 100;
nn = nntrain(nn, train_x, train_y, opts);
[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.16, 'Too big error');

%% ex2 train a 100-100 hidden unit SDAE and use it to initialize a FFNN
% Setup and train a stacked denoising autoencoder (SDAE)
rng(0);
sae = saesetup([784 100 100]);
sae.ae{1}.activation_function = 'sigm';
sae.ae{1}.learningRate = 1;
sae.ae{1}.inputZeroMaskedFraction = 0.5;

sae.ae{2}.activation_function = 'sigm';
sae.ae{2}.learningRate = 1;
sae.ae{2}.inputZeroMaskedFraction = 0.5;

opts.numepochs = 1;
opts.batchsize = 100;
sae = saetrain(sae, train_x, opts);
visualize(sae.ae{1}.W{1}(:,2:end)')

% Use the SDAE to initialize a FFNN
nn = nnsetup([784 100 100 10]);
nn.activation_function = 'sigm';
nn.learningRate = 1;

%add pretrained weights
nn.W{1} = sae.ae{1}.W{1};
nn.W{2} = sae.ae{2}.W{1};

% Train the FFNN
opts.numepochs = 1;
opts.batchsize = 100;
nn = nntrain(nn, train_x, train_y, opts);
[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.1, 'Too big error');

When I execute the file,
I get the following error (I am using MATLAB 2012b):
sae_error

SAE Setup

Hi Rasmus,
I found an error in the initialisation of the SAE.
When you initialise the AE in saesetup.m you call
sae.ae{u} = nnsetup(struct('size', sae.size(u)), x, x);
but the nnsetup method accepts only one parameter (size) as input.

Thank you again for the wonderful job you're doing with this library!
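
For reference, each autoencoder in the stack is just a symmetric feed-forward net, so a sketch of the intended setup for the saesetup([784 100]) call used in the README example would be something like this (illustrative only, not the exact saesetup.m code):

ae = nnsetup([784 100 784]);   % encode 784 -> 100, decode back to 784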

example of DBN not working

First I got this fault:

dbn = dbntrain(dbn, train_x, opts);
Error: File: rbmtrain.m Line: 3 Column:
36
The input character is not valid in
MATLAB statements or expressions.

When I fixed it, I got this one:

dbn = dbntrain(dbn, train_x, opts);
Operands to the || and && operators must
be convertible to logical scalar values.

Error in rbmtrain (line 3)
assert(all(x>=0) && all(x<=1), 'all
data in x must be in [0:1]');

Error in dbntrain (line 4)
dbn.rbm{1} = rbmtrain(dbn.rbm{1}, x,
opts);

Implementation of DAE

Hi, I have a question about your implementation of the DAE. In the current version, you treat the DAE simply as a 3-layer FFNN. How can you ensure the weights of the DAE are tied? Also, for the sDAE, treating it only as an FFNN likewise cannot ensure the weights are tied.

Too big error when using CNN

I am new to deep learning. When I try the CNN test example with the first 3000 rows of train_x, I find the error is very large. Even if I set test_x = train_x, the error is still too large. Does this mean that a small dataset is not suitable for deep learning? By the way, must the input data be normalized to [0,1]?

cnnbp() error

I'm training a CNN with a 250x250 input image. The CNN has the same parameters as in your example program:

cnn.layers = {
    struct('type', 'i') %input layer
    struct('type', 'c', 'outputmaps', 6, 'kernelsize', 5) %convolution layer
    struct('type', 's', 'scale', 2) %sub sampling layer
    struct('type', 'c', 'outputmaps', 12, 'kernelsize', 5) %convolution layer
    struct('type', 's', 'scale', 2) %subsampling layer
};

with a batch size of 50.

On the first epoch evaluation I get the following error:

Error using  .* 
Array dimensions must match for binary array op.

Error in cnnbp (line 26)
                net.layers{l}.d{j} = net.layers{l}.a{j} .* (1 - net.layers{l}.a{j}) .* (expand(net.layers{l + 1}.d{j}, [net.layers{l + 1}.scale net.layers{l +
                1}.scale 1]) / net.layers{l + 1}.scale ^
Error in cnntrain (line 17)
            net = cnnbp(net, batch_y);

Setting a breakpoint reveals that net.layers{l}.a{j} is size [119 119 50] and (1 - net.layers{l}.a{j}) is size [119 119 50] while the last bit of the assignment is size [118 118 50].

I don't know enough about the CNN backprop to troubleshoot further.

da in sae

Hello,
I got confused while reading the code. In the SAE, the DA is an NN whose inputZeroMaskedFraction is set to a non-zero value. To my knowledge, in a DA the weights of the second layer should be the transpose of the first layer's weights, but I am not able to find that constraint in the code.
Thanks in advance.

Best Wishes,
Li.L

Parameter issue of visualize function invoked in caeexamples.m file.

At the end of caeexamples.m, it invokes the visualize() function like this:

visualize(ff',1)

When I run caeexamples on my laptop (MATLAB 2012b), it returns this:

Attempted to access mm(2); index out of bounds because numel(mm)=1.

Error in visualize (line 24)
    a=mm(2)*ones(num*s2+num-1,num*s1+num-1);

Error in caeexamples (line 32)
figure;visualize(ff',1)

I think the 1 is being treated as mm by the visualize function. If we specify mm explicitly like this:

%Visualize the output kernels
ff=[];
for i=1:numel(cae.ok{1}); 
    mm = cae.ok{1}{i}(1,:,:); 
    ff(i,:) = mm(:); 
end; 
ff = ff';
figure;visualize(ff, [min(ff(:)) max(ff(:))], 1)

everything goes well.

Why does the sigmoid output use the MSE loss?

In nnff.m we have:

    case {'sigm', 'linear'}
        nn.L = 1/2 * sum(sum(nn.e .^ 2)) / m; 

I think the loss nn.L = -sum(sum(y .* log(nn.a{n}) + (1-y) .* log(1-nn.a{n}))) / m is more suitable for a sigmoid output than the MSE loss, isn't it? But in nnbp.m, what we use is not the derivative of the MSE loss, but the derivative of the loss I mentioned. This is confusing.
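
A possible explanation (my reading, not confirmed by the author): for a sigmoid output unit a = sigm(z), the gradients of the two losses with respect to z are

% cross-entropy:  dL/dz = a - y                      (the sigmoid derivative cancels)
% squared error:  dL/dz = (a - y) .* a .* (1 - a)
% So if nnbp.m propagates (a - y) directly, it is following the cross-entropy
% gradient while nnff.m reports the MSE value; training stays consistent, only the
% printed loss does not match the gradient being minimized.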

Reporting Bugs

While preparing another pull request I stumbled upon some bugs. I won't solve all of them! For point 4, I'd like to know whether I introduced the error.

  1. If you run the second DBN example with opts.plot == 1, there will be a crash in nntrain.

  2. Calling saetrain() in the second SAE example causes a memory exhausted error on my 32-bit machine.

  3. Calling zscore in the first NN example causes a memory exhausted error on my 32-bit machine. I used a workaround, but maybe someone will solve this problem! Furthermore, calling nntrain in examples 5 and 6 does the same, but I don't know a workaround.

  4. The assertion of the first NN example fails. My error is 0.801. Did I introduce the error or was it there before?

  5. Calling cnnsetup in test_example_CNN causes a memory exhausted error on my 32-bit machine.

sincerely

saetrain error using several hidden layers

Hi, I get an error if I try to run the SAE with two hidden layers instead of one.

The following setup will reproduce the error:

function test_example_SAE
load mnist_uint8;

train_x = double(train_x)/255;
test_x  = double(test_x)/255;
train_y = double(train_y);
test_y  = double(test_y);

%%  ex1 train a 100 hidden unit SDAE and use it to initialize a FFNN
%  Setup and train a stacked denoising autoencoder (SDAE)
rng(0);
sae = saesetup([784 100 100]);
sae.ae{1}.normalize_input           = 0;
sae.ae{1}.activation_function       = 'sigm';
sae.ae{1}.learningRate              = 1;
sae.ae{1}.inputZeroMaskedFraction   = 0.5;

sae.ae{2}.normalize_input           = 0;
sae.ae{2}.activation_function       = 'sigm';
sae.ae{2}.learningRate              = 1;
sae.ae{2}.inputZeroMaskedFraction   = 0.5;

opts.numepochs =   1;
opts.batchsize = 100;
sae = saetrain(sae, train_x, opts);
visualize(sae.ae{1}.W{1}(:,2:end)')

% Use the SDAE to initialize a FFNN
nn = nnsetup([784 100 100 10]);
nn.normalize_input                  = 0;
nn.activation_function              = 'sigm';
nn.learningRate                     = 1;

%add pretrained weights
nn.W{1} = sae.ae{1}.W{1};
nn.W{2} = sae.ae{2}.W{1};


% Train the FFNN
opts.numepochs =   1;
opts.batchsize = 100;
nn = nntrain(nn, train_x, train_y, opts);
[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.16, 'Too big error');

I think the error arises because the "bias" column of ones is carried from the first autoencoder to the second.

The following change to saetrain makes the code run:

function sae = saetrain(sae, x, opts)
    for i = 1 : numel(sae.ae);
        disp(['Training AE ' num2str(i) '/' num2str(numel(sae.ae))]);
        sae.ae{i} = nntrain(sae.ae{i}, x, x, opts);
        t = nnff(sae.ae{i}, x, x);
        x = t.a{2};
        x = x(:,2:end);   %added
    end
end

Awesome toolbox btw

-Søren

Octave : `rng' undefined

octave:15> test_example_CNN
warning: load: file found in load path
error: `rng' undefined near line 12 column 1
error: called from:
error: /home/DeepLearnToolbox/tests/test_example_CNN.m

license

Please include a LICENSE file.

(By the way, thank you for DeepLearnToolbox.)

negative data always result in a 'degenerate' network

Hi there
I tried to figure out how to modify the code to be able to apply the DBN to data with some negative values; no luck -_- though.

I tested the data with other machine learning algos (not from this package) and there is always some more or less meaningful solution learned, so it seems the input space is not badly conditioned and that this toolbox is limited to positive data only(?)

How to go about it?

Any feedback appreciated.
fritol

visualization of sae

When running the SAE example:

Attempted to access mm(2); index out of bounds because numel(mm)=1.

Error in visualize (line 24)
a=mm(2)*ones(num*s2+num-1,num*s1+num-1);

Error in caeexamples (line 32)
figure;visualize(ff',1)

CNN, not a bug

You set net.layers{l}.b{j} = 0; in cnnsetup.m, line 10,
but you didn't use it when training the CNN.
I guess you wanted to do it like "Notes on Convolutional Neural Networks" at the beginning,
but finally implemented a simpler one.

Running a DBN

The example shows how to train a DBN and visualize its weights, but doesn't give an example of how to apply the DBN to new data.
Suppose I have trained the DBN on the mnist training set. Now I want to present a new handwritten digit and see what the response of each of the hidden units is to that image. How would I go about that?
Thanks!
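
One way to do this with functions that already appear on this page (a sketch, not an official recipe): unfold the trained DBN into a feed-forward net and read the hidden-layer activations off nnff. Here new_x and hidden1 are names introduced for illustration:

nn = dbnunfoldtonn(dbn, 10);                       % as in the DBN example above
nn.activation_function = 'sigm';                   % match the RBM's sigmoid units
nn = nnff(nn, new_x, zeros(size(new_x, 1), 10));   % forward pass with dummy labels
hidden1 = nn.a{2};                                 % responses of the first hidden layer
                                                   % (the first column is the bias column of ones)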

typo in comment

RMB should be RBM

figure; visualize(dbn.rbm{1}.W', 1); % Visualize the RMB weights

Confusion about nn

Hello,
I am reading the code of the Deep Learning toolbox. I found the code line
net.a{i} = sigm(repmat(net.b{i - 1}', m, 1) + net.a{i - 1} * net.W{i - 1}');
in nnff.m line 10.
Shouldn't it be
net.a{i} = sigm(repmat(net.b{i - 1}, m, 1) + net.W{i - 1}*net.a{i - 1} );
or have I missed some details that lead to the former code?

Best Wishes,
rustle
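
A note on the convention that may resolve this (my reading of the examples above, not an authoritative answer): the toolbox stores one example per row (train_x is 60000 x 784), so the batched forward pass uses a * W', which is just the transpose of the column-vector form W * a:

a_row = rand(1, 784);                 % one example stored as a row, toolbox convention
W     = rand(100, 784);               % weights of a 784 -> 100 layer
norm((W * a_row')' - a_row * W')      % ~0: the two formulations are equivalent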

test_cnn_gradients_are_numerically_correct fails in Octave 3.6.2

The CNN example in the README does not seem to be working. I noticed this specific test, and it also fails. I'm sorry that I'm unable to submit a patch for this issue, but wanted to make you aware of it.

GNU Octave, version 3.6.2
Copyright (C) 2012 John W. Eaton and others.
This is free software; see the source code for copying conditions.
There is ABSOLUTELY NO WARRANTY; not even for MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. For details, type `warranty'.

Octave was configured for "x86_64-pc-linux-gnu".

Additional information about Octave is available at http://www.octave.org.

Please contribute if you find this software useful.
For more information, visit http://www.octave.org/help-wanted.html

Read http://www.octave.org/bugs.html to learn how to submit bug reports.

For information about changes from previous versions, type `news'.

warning: X11 DISPLAY environment variable not set
octave:1> addpath(genpath('DeepLearnToolbox'));
warning: function DeepLearnToolbox/util/xunit/runtests.m shadows a core library function
octave:2> test_cnn_gradients_are_numerically_correct
warning: nested functions are coerced into subfunctions in file /home/ubuntu/vol/DeepLearnToolbox/CNN/cnnbp.m
error: numerical gradient checking failed
error: called from:
error: /home/ubuntu/vol/DeepLearnToolbox/CNN/cnnnumgradcheck.m at line 63, column 33
error: /home/ubuntu/vol/DeepLearnToolbox/tests/test_cnn_gradients_are_numerically_correct.m at line 15, column 1
octave:2>

[patch] cnnff fails to evaluate a single (1 row) data instance (batchsize=1)

Hello,
when experimenting with the CNN, I hit the following issue:

>> c=cnnff(c,test)
Error using  - 
Matrix dimensions must agree.
Error in cnnff (line 11)
                z = zeros(size(net.layers{l - 1}.a{1}) - [net.layers{l}.kernelsize - 1 net.layers{l}.kernelsize - 1
                0]); 

where test is one data instance (a 784-element row) reshaped to 28x28.
The problem is in size(): for a = ones(28,28,1), size(a) is [28 28], so we are one number short.

A workaround is, instead of classifying instances one by one, to classify them all together in one block.

So,

for i=1:100
   test=data(i,:);
   test = double(reshape(test',28,28,size(test,1)));
   c=cnnff(c, test);
end

Fails, while

  test=data;
  test = double(reshape(test',28,28,size(test,1)));
   c=cnnff(c, test);

is ok.

One solution seems to be a switch for size 1 in the 3rd dimension;
pseudo-diff of cnnff.m:

%  !!below can probably be handled by insane matrix operations
            for j = 1 : net.layers{l}.outputmaps   %  for each output map
                %  create temp output map
 + % only 1 instance (1 row) makes problems with size(ones(28,28,1)) == [28 28] 
 +             if(size(net.layers{l - 1}.a{1},3) == 1)
 +                   z = zeros([size(net.layers{l - 1}.a{1}) 1] - [net.layers{l}.kernelsize - 1 net.layers{l}.kernelsize - 1 0]);
 +             else
                z = zeros(size(net.layers{l - 1}.a{1}) - [net.layers{l}.kernelsize - 1 net.layers{l}.kernelsize - 1 0]);
 +              end
                for i = 1 : inputmaps   %  for each input map
                    %  convolve with corresponding kernel and add to temp output map
                    z = z + convn(net.layers{l - 1}.a{i}, net.layers{l}.k{i}{j}, 'valid');
                end

Thank you for a totally amazing suite for deep learning! I just won a school project in recognition with it ;)
I will use it for my thesis, so I hope more patches & enhancements come.

Best wishes, Mark

the RBM issues

I see you implement CD-1 in DBN\rbmtrain.m,
but according to the pseudo-code in "Learning Deep Architectures for AI",
h2 is not sampled from v2, so why do you use sigmrnd to sample it?
Or are both sigm() and sigmrnd() correct?
Thanks
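
For context, a generic CD-1 sketch (not the toolbox's exact code; the bias field names b and c are assumptions here). Hinton's practical guide recommends using probabilities rather than samples for the final hidden update, which is the point being raised:

h1 = sigmrnd(repmat(rbm.c', m, 1) + v1 * rbm.W');   % sample hidden units given the data
v2 = sigmrnd(repmat(rbm.b', m, 1) + h1 * rbm.W);    % sample the reconstruction
h2 = sigm(repmat(rbm.c', m, 1) + v2 * rbm.W');      % probabilities only for the last step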

poor performance in dropout example

The default learning rate seems to be too high in the example.

This is the default:

octave-3.6.2.exe:20> nn = nnsetup([784 800 800 10]);
octave-3.6.2.exe:21> nn.dropoutFraction = 0.5;
octave-3.6.2.exe:22> nn.alpha  = 1e1;       %  Learning rate
opts.numepochs = 10;   %  Number of full sweeps through data
opts.batchsize = 100;   %  Take a mean gradient step over this many samples
octave-3.6.2.exe:25> nn = nntrain(nn, train_x, train_y, opts);
epoch 1/10. Took 62.275 seconds. Mean squared error is 0.50315
epoch 2/10. Took 61.807 seconds. Mean squared error is 0.50001
epoch 3/10. Took 61.463 seconds. Mean squared error is 0.5
epoch 4/10. Took 61.635 seconds. Mean squared error is 0.5
epoch 5/10. Took 61.791 seconds. Mean squared error is 0.5
epoch 6/10. Took 61.947 seconds. Mean squared error is 0.5
epoch 7/10. Took 62.275 seconds. Mean squared error is 0.5
epoch 8/10. Took 61.604 seconds. Mean squared error is 0.5
epoch 9/10. Took 62.167 seconds. Mean squared error is 0.5
epoch 10/10. Took 62.29 seconds. Mean squared error is 0.5
octave-3.6.2.exe:26> [er, bad] = nntest(nn, test_x, test_y);
octave-3.6.2.exe:27> disp([num2str(er * 100) '% error']); 
89.68% error

Here is a better choice:

octave-3.6.2.exe:28> nn = nnsetup([784 800 800 10]);
octave-3.6.2.exe:29> nn.dropoutFraction = 0.5;
octave-3.6.2.exe:30> nn.alpha  = 1;       %  Learning rate
opts.numepochs = 10;   %  Number of full sweeps through data
opts.batchsize = 100;   %  Take a mean gradient step over this many samples
octave-3.6.2.exe:33> nn = nntrain(nn, train_x, train_y, opts);
epoch 1/10. Took 62.15 seconds. Mean squared error is 0.3265
epoch 2/10. Took 61.291 seconds. Mean squared error is 0.20155
epoch 3/10. Took 61.416 seconds. Mean squared error is 0.15969
epoch 4/10. Took 61.931 seconds. Mean squared error is 0.14041
epoch 5/10. Took 61.556 seconds. Mean squared error is 0.12736
epoch 6/10. Took 61.868 seconds. Mean squared error is 0.11906
epoch 7/10. Took 61.448 seconds. Mean squared error is 0.11025
epoch 8/10. Took 61.526 seconds. Mean squared error is 0.10517
epoch 9/10. Took 61.354 seconds. Mean squared error is 0.10157
epoch 10/10. Took 61.416 seconds. Mean squared error is 0.096239
octave-3.6.2.exe:34> [er, bad] = nntest(nn, test_x, test_y);
octave-3.6.2.exe:35> disp([num2str(er * 100) '% error']); 
8.39% error

Too big error

I am running test_example_SAE, but getting a "Too big error". Is there a way to prevent this error? What can I do?

test_example_SAE
warning: load: file found in load path
Training AE 1/1
epoch 1/1. Took 9.3665 seconds. Mean squared error on training set is 10.61
epoch 1/1. Took 3.6861 seconds. Mean squared error on training set is 0.2197
Training AE 1/2
epoch 1/1. Took 9.4617 seconds. Mean squared error on training set is 10.641
Training AE 2/2
epoch 1/1. Took 2.0744 seconds. Mean squared error on training set is 3.6691
epoch 1/1. Took 4.57 seconds. Mean squared error on training set is 0.17955
error: Too big error
error: called from:
error: /opt/boxen/homebrew/Cellar/octave/3.6.4/share/octave/3.6.4/m/testfun/assert.m at line 74, column 9
error: /Users/nederhrj/src/DeepLearnToolbox/tests/test_example_SAE.m at line 65, column 1

Need explanation with regards to cnnsetup

In cnnsetup.m, what is the meaning of the variables fan_out, fan_in, fvnum, onum, net.ffb, and net.ffW?
And what does the following code mean?
net.layers{l}.k{i}{j} = (rand(net.layers{l}.kernelsize) - 0.5) * 2 * sqrt(6 / (fan_in + fan_out));

Thank you, I am a newbie in deep learning.
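
Some background that may help (general knowledge about this initialization scheme, not taken from the toolbox's documentation): the expression draws each kernel weight uniformly from [-sqrt(6/(fan_in+fan_out)), +sqrt(6/(fan_in+fan_out))], the "normalized initialization" of Glorot and Bengio (2010), where fan_in and fan_out count the weights feeding into and out of a unit. Rewritten for readability:

limit = sqrt(6 / (fan_in + fan_out));                                         % normalized-init bound
net.layers{l}.k{i}{j} = (rand(net.layers{l}.kernelsize) - 0.5) * 2 * limit;   % uniform in [-limit, limit]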

classes always one for SDAE

I am trying to reuse your code for another application and found that the SDAE always returns the class set to 1 regardless of the dataset I'm using (in my case I use just one class/label). My application also uses other algorithms, and those algorithms classify these datasets with no problem, so it seems to be an SDAE coding problem. Here is my setup:

% Setup and train a stacked denoising autoencoder (SDAE)
rand('state',0)
sae = saesetup([size(xtrain,2) size(xtrain,2)]);
sae.ae{1}.activation_function = 'tanh_opt';
sae.ae{1}.learningRate = 0.1;
sae.ae{1}.inputZeroMaskedFraction = 0.5;
opts.numepochs = epoch;
opts.batchsize = 100;
sae = saetrain(sae, xtrain, opts);
visualize(sae.ae{1}.W{1}(:,2:end)')

% Use the SDAE to initialize a FFNN
nn = nnsetup([size(xtrain,2) size(xtrain,2),1]);
nn.activation_function = 'tanh_opt';
nn.learningRate = 0.1;
nn.W{1} = sae.ae{1}.W{1};

% Train the FFNN
opts.numepochs = epoch;
opts.batchsize = 100;

model = nntrain(nn, xtrain, ytrain, opts);

classes = nnpredict(model, xtest);

The datasets (xtrain, ytrain, xtest, ytest) are available on request. I'm running it
on MATLAB, not Octave.

Krzysztof
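
A hedged guess at the cause (assuming nnpredict returns the index of the largest output activation, which I have not verified): the FFNN above ends in a single output unit, nnsetup([... 1]), so the argmax over one column is always 1. Using one output unit per class, for example

nn = nnsetup([size(xtrain,2) size(xtrain,2) 2]);   % two output units for two classes

together with a two-column (one-hot) ytrain, would let nnpredict distinguish the classes.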

using 'linear' output seems to result in divergence / NaNs very quickly instead of finding good weights

Hello,
Using MATLAB's house_dataset from its Neural Network Toolbox as a test example for a linear rather than logistic problem seems to crash the backpropagation and training algorithms. Printing intermediate results shows that the weights diverge to astronomical numbers very quickly.

The house_dataset example has training data that's 13 x 506, and 'label' data that's 1 x 506 because the 'label' is a real number not a logical.

The example code below shows rapid divergence of nn.W and nn.a:

clear
load house_dataset ;% use test house dataset, 13 variables in houseInputs,
% for house values, houseTargets, with 506 different samples. so
% houseInputs is 13 x 506, houseTargets is 1x506.

inputs = houseInputs(:,1:400)'; %select training data and transpose to deeplearntoolbox format
targets = houseTargets(1:400)'; %from Matlab format

rand('state',0);
nn = nnsetup([size(inputs,2) 100 1]);
nn.output='linear'; %this is a linear problem
opts.numepochs = 1; % Number of full sweeps through data
opts.batchsize = 1; % Take a mean gradient step over this many samples
[nn, L] = nntrain(nn, inputs, targets, opts);

trainpredict= houseInputs(:,401:420)';
targetpredict = houseTargets(1,401:420)';
nullpredict = zeros(1,20)';
nn.testing = 1;

nn = nnff(nn, trainpredict, nullpredict);
nn.testing = 0;

L2 regularization of bias terms

Hi

It looks like there is L2 regularization on the bias terms in nnapplygrads.m. Is that intentional?

line 7-11: nnapplygrads.m
if(nn.weightPenaltyL2>0)
dW = nn.dW{i} + nn.weightPenaltyL2 * nn.W{i};
else
dW = nn.dW{i};
end

Casper
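
If the intent is to exclude the bias terms, a minimal sketch (assuming the bias weights sit in the first column of nn.W{i}, as the W{1}(:,2:end) indexing used elsewhere on this page suggests) would be:

if(nn.weightPenaltyL2>0)
    W_nobias = nn.W{i};
    W_nobias(:,1) = 0;                                  % do not penalize the bias column
    dW = nn.dW{i} + nn.weightPenaltyL2 * W_nobias;
else
    dW = nn.dW{i};
end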

assert statement in rbmtrain inverted

Hi!

The following assert statement in rbmtrain.m, line 6, seems incorrect:
assert(rem(numbatches, 1) ~= 0, 'numbatches not integer');

Simply switch to == instead of ~=.
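
That is, the corrected line would read:

assert(rem(numbatches, 1) == 0, 'numbatches not integer');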

Thanks for the effort you put into this toolbox.
I just found it and am currently checking it out.

code contributions

Are you interested in code contributions?

I have written a simple wrapper function around nntrain which evaluates training and validation performance after each epoch. It optionally plots the training progress and saves the network every n epochs in case you want to stop training early.
I think it would fit nicely in your examples.

sda_dropout

NaN error

The following code will generate a NaN error.


load mnist_uint8;

train_x = double(train_x) / 255;
test_x = double(test_x) / 255;
train_y = double(train_y);
test_y = double(test_y);

% normalize
% [train_x, mu, sigma] = zscore(train_x);
% test_x = normalize(test_x, mu, sigma);

%% ex1 vanilla neural net
rand('state',0)
nn = nnsetup([784 100 10]);
nn.output = 'softmax';
opts.numepochs = 1; % Number of full sweeps through data
opts.batchsize = 100; % Take a mean gradient step over this many samples
[nn, L] = nntrain(nn, train_x, train_y, opts);

[er1, bad] = nntest(nn, test_x, test_y);

Issue in calling visualize in test_example_DBN

https://github.com/rasmusbergpalm/DeepLearnToolbox/blob/master/tests/test_example_DBN.m#L18

Line 18 of test_example_DBN calls visualize to display the weights. It passes a second argument of 1. This argument looks like it's supposed to be the display range for the image, but visualize never checks that it has 2 elements, and lines 24 and 46 of visualize both use the second element.

By omitting the 1 argument and letting visualize calculate its own scale, it seems to work fine.
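
In other words, the call would become the same form already used in the README example above:

figure; visualize(dbn.rbm{1}.W');   %  Visualize the RBM weights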

Comments are lacking; source code hard to understand

Hi there,
I am trying to use the DeepLearnToolbox in my Master's thesis. Therefore I'd like to understand the algorithms and maybe change them for my purposes. I bet I'm not the only one wanting to understand the source code.

Unfortunately, "dbntrain" and "rbmtrain" are both not documented at all. Could anyone please improve this? I can't, since I don't understand huge parts of "rbmtrain".
Furthermore, tanh_opt not being documented in the source but only in the git log can be explained only by ill will.

Another problem is that most of the variable names are very short. Self-describing names are widely accepted as good style, so maybe some variable names like "vW", "vb", "vc", "kk" could be improved.

I really hope that this will improve soon, and I look forward to working with this toolbox.
