lasagne / recipes

Lasagne recipes: examples, IPython notebooks, ...

License: MIT License

Python 100.00%

recipes's Issues

VGGNet non-commercial use only?

The vgg16 pretrained model claims to be for non-commercial use only, while the VGGNet website says "The models are released under Creative Commons Attribution License."

What is the reason for the limitation?

Broken link to ResNet50 weights

The link to the ResNet50 Lasagne weights appears to be broken, or does not have the right permissions.
That is, the download link in https://github.com/Lasagne/Recipes/blob/master/modelzoo/resnet50.py doesn't work (https://s3.amazonaws.com/lasagne/recipes/pretrained/imagenet/resnet50.pkl).

Cheers
Casper

ResNet-50 from Caffe to Lasagne

Hi, I made a script for transferring weights from the Caffe ImageNet-pretrained ResNet-50 to Lasagne: https://github.com/mephistopheies/resnet50caffe2lasagne/blob/master/resnet-50.ipynb. It is similar to the existing one for VGG: https://github.com/Lasagne/Recipes/blob/master/examples/Using%20a%20Caffe%20Pretrained%20Network%20-%20CIFAR10.ipynb.

Would it be useful to have such a recipe script in the "modelzoo" folder and/or the "examples" folder?

Deep_Residual_Learning_CIFAR-10.py

I ran the saved model file (cifar_model_n5.npz) and got the following results:

Loading data...
Building model and compiling functions...
number of parameters in model: 464154
testing phase!
Final results:
test loss: 3.326520
test accuracy: 40.55 %

I do not understand why the accuracy is so low.

caffe model to lasagne example not working

I tried to run the model conversion example using the exact same code from the notebook, but an accuracy of only 0.406 is reported instead of 0.894. I wonder if something is broken (some intermediate results are not the same, e.g. conv1).

Deep nets with stochastic depth

I hope this is the appropriate venue to post this. I don't have an implementation yet, but maybe this ticket could encourage some work.

I am currently interested in this stochastic depth paper:

http://arxiv.org/pdf/1603.09382v2.pdf

I was going to have a go at implementing this, but I was a bit stumped as to how one would go about the identity transform that is mentioned in equation (2). As you can see, if the next layer and the current layer have different output shapes, you need to linearly project the output of the current layer so that it matches the dimensions of the output of the following layer. I'm not clear on how this is done and am afraid it's blatantly obvious... is your "projection matrix" (or whatever it's called) a matrix (of some appropriate shape) consisting solely of ones? Furthermore, how would we do this for convolutional networks?

It seems like that's the only roadblock for me -- the binomial mask is easy to do.

Let me know what you think.

PS: Interesting, I found a post asking on how to go about implementing this, but it seems to omit the identity transform:

https://www.reddit.com/r/MachineLearning/comments/4dr998/askreddit_has_anyone_implemented_resnets_with/
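
For illustration, here is a minimal NumPy sketch of one residual block with stochastic depth. It is not taken from the paper's code. The names `shortcut`, `stochastic_depth_block` and `toy_residual` are made up for this example, and `residual_fn` stands in for the block's convolutions. Where the shapes differ, the shortcut below subsamples spatially and zero-pads the extra channels, which is one parameter-free option used for ResNets; a learned 1x1 convolution is the other common choice. Neither is a matrix of ones.

```python
import numpy as np

def shortcut(x, out_channels, stride=1):
    """Parameter-free shortcut: subsample spatially, zero-pad new channels."""
    x = x[:, :, ::stride, ::stride]
    extra = out_channels - x.shape[1]
    if extra > 0:
        pad = np.zeros((x.shape[0], extra) + x.shape[2:], dtype=x.dtype)
        x = np.concatenate([x, pad], axis=1)
    return x

def stochastic_depth_block(x, residual_fn, out_channels, stride=1,
                           survival_prob=0.8, training=True, rng=None):
    """ReLU(gate * f(x) + shortcut(x)); the gate is Bernoulli during training."""
    skip = shortcut(x, out_channels, stride)
    if training:
        if rng is None:
            rng = np.random.default_rng()
        gate = float(rng.random() < survival_prob)  # 1.0 keeps the block, 0.0 drops it
    else:
        gate = survival_prob                        # expected gate value at test time
    if gate == 0.0:
        return np.maximum(skip, 0.0)                # residual branch skipped entirely
    return np.maximum(gate * residual_fn(x) + skip, 0.0)

# Hypothetical residual branch: doubles the channels and halves the resolution.
def toy_residual(x):
    shape = (x.shape[0], 2 * x.shape[1], x.shape[2] // 2, x.shape[3] // 2)
    return np.ones(shape, dtype=x.dtype)

x = np.ones((4, 16, 8, 8), dtype=np.float32)
y_test = stochastic_depth_block(x, toy_residual, out_channels=32, stride=2,
                                training=False)
y_drop = stochastic_depth_block(x, toy_residual, out_channels=32, stride=2,
                                survival_prob=0.0, training=True)
```

When a block is dropped, only the shortcut path is evaluated, which is where the training-time savings come from.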

CTC

Is anyone interested in CTC?

I ported a theano CTC to lasagne: https://github.com/skaae/Lasagne-CTC

Problem is I haven't really had time to test it and I do not have a nice dataset for testing.

Index-learning of unsupervised low dimensional embeddings

Just thought we could collect some ideas for things that'd be reasonably easy to turn into Lasagne Recipes. One of them is Ben Graham's "Index-learning of unsupervised low dimensional embeddings", available at: http://www2.warwick.ac.uk/fac/sci/statistics/staff/academic-research/graham/indexlearning.pdf

Anybody who's interested in doing this, raise your hand here! :)

Inception_v3 model, the mean in batch norm parameters of the pretrained model is negative??

Hi all,

Just tried the inception model and couldn't get it to work. It always gives a very low accuracy.

1, I checked the pretrained weights, and there are negative values in the batch norm means. However, batch norm comes after the convolution layer with ReLU units, which always produces nonnegative values, so the mean should always be nonnegative. Why is the mean in the pretrained weights negative?

2, In the Lasagne document shown as the following, batch norm should be used after the convolution and before ReLU. Is this the case or should it be used after ReLU and before convolution of next layer? Because going through ReLU will always suppress the negative values, it would not make much sense in inserting between convolution and ReLU?

"This layer should be inserted between a linear transformation (such as a DenseLayer, or Conv2DLayer) and its nonlinearity. The convenience function batch_norm() modifies an existing layer to insert batch normalization in front of its nonlinearity."

3, There are 1008 output units in the model, while there are only 1000 classes in ImageNet. From this line (https://github.com/soumith/inception.torch/blob/master/googlenet.lua#L113), I found that only classes 2-1001 are used for ImageNet. Is this true? I then used the following operation, and just want to check with you that it is correct.

T.set_subtensor(net['softmax'].W[0:1000], inceptionmodel_param_values[470][1:1001])
T.set_subtensor(net['softmax'].b[0:1000], inceptionmodel_param_values[471][1:1001])

Thanks a lot.
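
Two details may be worth checking in the snippet above. `T.set_subtensor` returns a new symbolic expression and does not modify the shared variable in place. Also, a Lasagne `DenseLayer` stores `W` with shape `(num_inputs, num_units)`, so the class axis is the second one. The NumPy sketch below shows the slicing on stand-in arrays. The sizes 2048 and 1008 are assumptions for illustration only, as is the layout of the pretrained arrays.

```python
import numpy as np

num_features, num_outputs = 2048, 1008   # assumed sizes, for illustration only
rng = np.random.default_rng(0)
W_full = rng.standard_normal((num_features, num_outputs)).astype(np.float32)
b_full = rng.standard_normal(num_outputs).astype(np.float32)

# Keep output units 1..1000 (0-based): skip unit 0 and the trailing units.
W_1000 = W_full[:, 1:1001]   # slice the output (second) axis, not the input axis
b_1000 = b_full[1:1001]
```

The sliced arrays could then be written into the layer with the shared variables' `set_value` method, e.g. `net['softmax'].W.set_value(W_1000)`, provided the layer was built with 1000 units.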

Model accuracy with pretrained weights is always lower than reported. For vgg, it is about 69% (while reported 72.7%), and for inception_v3, it is 78% (while reported 81.3%)

Hi all,

As shown in the subject, I cannot get the accuracy reported by their paper with the pretrained weights, for both VGG and inception_v3 (from ImageNET classification). It is always about 3% lower. For vgg, it is about 69% (while reported 72.7%), and for inception_v3, it is about 78% (while reported 81.3%).

It is driving me crazy. Any advice would be highly appreciated.

My setting,
1, Data is preprocessed using caffe to produce lmdb. It is resized to 256 for vgg and 384 for inception.
2, When testing, the data is oversampled with 10 crops (4 corners with center crop, plus flip). This is slightly different from the paper, but it should not cause 3% difference. It only improves about 1% using oversample compared to using the center crop. Oversample code is attached at the end.
3, The pretrained weights are downloaded from the model zoo.
4, Model is tested using GPU with theano configuration as THEANO_FLAGS='floatX=float32,device=gpu0,mode=FAST_RUN, nvcc.fastmath=True'.

Oversampling code:

        datum.ParseFromString(value)
        label = datum.label
        img = np.array(bytearray(datum.data)).reshape(datum.channels, datum.height, datum.width)
        for oversamplei in range(5):
            dx=self.cropindex[oversamplei][0]
            dy=self.cropindex[oversamplei][1]
            tempimg = img[:,dy:dy+self.crop_height,dx:dx+self.crop_width]
            for flipi in range(2):
                if flipi==1:
                    tempimg = tempimg[:,:,::-1]         

                self.data_batches[i*10+oversamplei*2+flipi,:,:,:] = tempimg-BGR_mean
                self.labels_batches[i*10+oversamplei*2+flipi] = np.int32(label)

Test code:

test_vggprediction = lasagne.layers.get_output(vggmodel['prob'], X_sym, deterministic=True)
_,vggprediction_shape=test_vggprediction.shape
temptest_vggprediction=test_vggprediction.reshape((-1,10,vggprediction_shape))
lable_oversample=y_sym[::10]
test_vggprediction_oversample=T.mean(temptest_vggprediction,axis=1,dtype=theano.config.floatX)
test_vggacc=T.mean(lasagne.objectives.categorical_accuracy(test_vggprediction_oversample, lable_oversample, top_k=1),dtype=theano.config.floatX)
test_vggacc_top5=T.mean(lasagne.objectives.categorical_accuracy(test_vggprediction_oversample, lable_oversample, top_k=5),dtype=theano.config.floatX)

Again, any advice would be highly appreciated.
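
One way to rule out the averaging step is to compare it against a plain NumPy version on a small batch. The sketch below mirrors the symbolic test code under the same assumption: the 10 crops of each image are contiguous in the batch. The function name `oversampled_accuracy` is made up for this example.

```python
import numpy as np

def oversampled_accuracy(probs, labels, crops=10, top_k=1):
    """probs: (n_images * crops, n_classes); labels: (n_images * crops,)."""
    n_classes = probs.shape[1]
    # Average the class probabilities over the crops of each image.
    mean_probs = probs.reshape(-1, crops, n_classes).mean(axis=1)
    image_labels = labels[::crops]                    # one label per image
    top = np.argsort(-mean_probs, axis=1)[:, :top_k]  # indices of the k best classes
    return float(np.mean(np.any(top == image_labels[:, None], axis=1)))

# Toy check: 2 images, 10 crops each, 3 classes.
probs = np.tile(np.array([[0.7, 0.2, 0.1]]), (20, 1))
labels = np.repeat(np.array([0, 1]), 10)
top1 = oversampled_accuracy(probs, labels, top_k=1)
top2 = oversampled_accuracy(probs, labels, top_k=2)
```

If the NumPy and Theano numbers agree, the gap is more likely to come from preprocessing (resize method, mean subtraction, channel order) than from the evaluation code.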

Spatial Transformer notebook falsely uses "float32"

Dears,

I tried https://github.com/Lasagne/Recipes/blob/master/examples/spatial_transformer_network.ipynb

but I get an error about types :

X = T.tensor4()
y = T.ivector()

# training output
output_train = lasagne.layers.get_output(model, X, deterministic=False)

# evaluation output. Also includes output of transform for plotting
output_eval, transform_eval = lasagne.layers.get_output([model, l_transform], X, deterministic=True)

sh_lr = theano.shared(lasagne.utils.floatX(LEARNING_RATE))
cost = T.mean(T.nnet.categorical_crossentropy(output_train, y))
updates = lasagne.updates.adam(cost, model_params, learning_rate=sh_lr)

train = theano.function([X, y], [cost, output_train], updates=updates)
eval = theano.function([X], [output_eval, transform_eval])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-c9e0c22f04d4> in <module>()
     12 updates = lasagne.updates.adam(cost, model_params, learning_rate=sh_lr)
     13 
---> 14 train = theano.function([X, y], [cost, output_train], updates=updates)
     15 eval = theano.function([X], [output_eval, transform_eval])

/usr/local/lib/python2.7/site-packages/theano/compile/function.pyc in function(inputs, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input)
    315                    on_unused_input=on_unused_input,
    316                    profile=profile,
--> 317                    output_keys=output_keys)
    318     # We need to add the flag check_aliased inputs if we have any mutable or
    319     # borrowed used defined inputs

/usr/local/lib/python2.7/site-packages/theano/compile/pfunc.pyc in pfunc(params, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input, output_keys)
    487                                          rebuild_strict=rebuild_strict,
    488                                          copy_inputs_over=True,
--> 489                                          no_default_updates=no_default_updates)
    490     # extracting the arguments
    491     input_variables, cloned_extended_outputs, other_stuff = output_vars

/usr/local/lib/python2.7/site-packages/theano/compile/pfunc.pyc in rebuild_collect_shared(outputs, inputs, replace, updates, rebuild_strict, copy_inputs_over, no_default_updates)
    202                        ' function to remove broadcastable dimensions.')
    203 
--> 204             raise TypeError(err_msg, err_sug)
    205         assert update_val.type == store_into.type
    206 

TypeError: ('An update must have the same type as the original shared variable (shared_var=<CudaNdarrayType(float32, vector)>, shared_var.type=CudaNdarrayType(float32, vector), update_val=Elemwise{add,no_inplace}.0, update_val.type=TensorType(float64, vector)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')

Where could this come from ?
Best
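
The error message says that a float64 update is being assigned to a float32 shared variable. One common cause, assumed here because the Theano configuration is not shown, is that `floatX` is left at its default of `float64`. In that case `T.tensor4()` is float64, and everything computed from it is upcast. The NumPy sketch below shows the same promotion rule on stand-in arrays.

```python
import numpy as np

param = np.zeros(3, dtype=np.float32)   # stands in for a float32 shared variable
grad64 = np.ones(3, dtype=np.float64)   # e.g. a gradient derived from a float64 input
lr = np.float32(0.001)

update_bad = param - lr * grad64                     # promoted to float64
update_ok = param - lr * grad64.astype(np.float32)   # stays float32
```

Running with `THEANO_FLAGS=floatX=float32` is the usual way to check whether this is the cause.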

A couple of pretrained models?

I have Lasagne versions of vgg-16, vgg-19 and googlenet with their pretrained parameters:
http://www.vlfeat.org/matconvnet/pretrained/
https://github.com/BVLC/caffe/tree/master/models/bvlc_googlenet

Would there be interest in adding these recipes? I reckon these recipes are more the microwave lasagne kind, but this might interest people too?

`flip_filters` in Conv2DDNNLayer vs Conv2DLayer

It's a common pitfall that the model zoo defines its models to use the Conv2DDNNLayer, implicitly defaulting to flip_filters=False. When people modify it to use the Conv2DLayer (which defaults to flip_filters=True), the models don't work as expected.

With the bleeding-edge version of Lasagne, there are three options to resolve this:

Explicitly add flip_filters=False to the constructors
Use Conv2DLayer and explicitly add flip_filters=False to the constructors (should be the same performance)
Use Conv2DLayer with the default flip_filters=True setting and update the pickle files accordingly
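
The difference between the two settings can be reproduced in a few lines of NumPy. With `flip_filters=False` the layer computes a cross-correlation, and with `flip_filters=True` it computes a true convolution, which is a correlation with the kernel flipped along both spatial axes. The helper names below are made up for this sketch.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Plain 'valid' cross-correlation (no kernel flip)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow), dtype=image.dtype)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def convolve2d_valid(image, kernel):
    """True convolution: correlation with the kernel flipped on both axes."""
    return correlate2d_valid(image, kernel[::-1, ::-1])

rng = np.random.default_rng(0)
image = rng.standard_normal((6, 6))
kernel = rng.standard_normal((3, 3))

# Same weights, different flip setting: the outputs disagree.
same_without_fix = np.allclose(correlate2d_valid(image, kernel),
                               convolve2d_valid(image, kernel))
# Flipping the stored weights once makes the two settings agree.
same_with_flipped_weights = np.allclose(
    correlate2d_valid(image, kernel),
    convolve2d_valid(image, kernel[::-1, ::-1]))
```

For 4D weight tensors, the third option amounts to storing `W[:, :, ::-1, ::-1]` in the pickle files.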

Recipes for vgg networks missing activations

edit: Looks like I misread.

fast and faster R-CNN

Is it worth implementing the fast R-CNN and faster R-CNN with Lasagne? Both are state-of-the-art image detection methods.

My current guess is that faster R-CNN is easier, since it relies purely on two networks (one for proposals, another for detection), and I would like to work on it. Has anyone tried that?

saliency maps for all layers

Hi, I am currently modifying the saliency maps of

https://github.com/Lasagne/Recipes/blob/master/examples/Saliency%20Maps%20and%20Guided%20Backpropagation.ipynb

so that I can plot out the saliencies for all layers and every 8 filters in the vgg net. Here is what I did, but the function compilation stage is prohibitively slow. It seems to me that the gradient loop did not properly exploit the stacked structure of the vgg net and has to go through the graph every single time.

I am just wondering is there a better way to do it? Thanks!

def compile_saliency_function1(net,layernamelist,layershapelist,scalefactor):
    inp = net['input'].input_var
    outp = lasagne.layers.get_output([net[layername] for layername in layernamelist], deterministic=True)
    saliencyfnlist=[]    
    for layeri in range(len(layernamelist)):
        filtercount=int(layershapelist[layeri]/scalefactor)
        filterindices=[ii*scalefactor for ii in range(filtercount)]
        layeroutp=outp[layeri]
        saliencylayerlist=[]
        for filterindex in filterindices:
            max_outpi=layeroutp[0,filterindex,]
            saliencylayerlist.append(theano.grad(max_outpi.sum(), wrt=inp))     
        print(len(saliencylayerlist))
        layerfnlist=theano.function([inp], saliencylayerlist)
        saliencyfnlist.append([layerfnlist]) 
    return saliencyfnlist

starttime=time.time()
saliencyfntuple=compile_saliency_function1(net,['conv5_1','conv5_2','conv5_3'],[512,512,512],8)
print('fn time',time.time()-starttime)

Wrong order of stride and pad arguments in build_simple_block

In the function build_simple_block in resnet50.py, the stride and pad arguments to the Conv2DLayer (ConvLayer) constructor are given in the order pad then stride (if use_bias is True, not if use_bias is False), but the Conv2DLayer constructor takes those two arguments in the order stride then pad.
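
The effect of swapping the two positional arguments can be seen from the output size alone. In the sketch below, `conv_layer` is a hypothetical stand-in that only mirrors the parameter order of `Conv2DLayer` (`incoming, num_filters, filter_size, stride, pad`). The stride and pad values are illustrative and are not read from resnet50.py.

```python
def conv_output_size(input_size, filter_size, stride=1, pad=0):
    """Output length of a convolution along one axis."""
    return (input_size + 2 * pad - filter_size) // stride + 1

# Hypothetical stand-in with the same parameter order as Conv2DLayer.
def conv_layer(input_size, num_filters, filter_size, stride=1, pad=0):
    return conv_output_size(input_size, filter_size, stride, pad)

intended_stride, intended_pad = 2, 3   # illustrative values

# Passing (pad, stride) positionally silently swaps their meaning.
swapped = conv_layer(224, 64, 7, intended_pad, intended_stride)
# Keyword arguments are immune to the ordering problem.
keyword = conv_layer(224, 64, 7, stride=intended_stride, pad=intended_pad)
```

Passing `stride=` and `pad=` by keyword in `build_simple_block` would avoid the issue regardless of order.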

Training C3D

@gyglim
I was wondering if you could share your code on training the C3D (e.g. training dataset, optimizer). I am trying to train C3D on my own data but either get NaN or memory problems. How did you manage to train such a massive network? Thanks

Call for content

@benanne @dnouri @craffel @f0k @skaae (and anyone else of course)

With first release imminent it would be nice to have a bit more here... I know a couple of you have stuff written up already, but I bet everyone has some suitable code lying around.

If you have anything you're willing to contribute, please open a PR... don't worry if it's not perfect, I can take care of making sure everything functions with the latest Lasagne.

rgb image input

Hi,
How do I train with RGB image input? Do I need 3 input layers? If I use '3' instead of shape[1], does this mean all channels are trained in isolation? Can I use more than 3 channels?

ini = lasagne.init.HeUniform()
l_in = lasagne.layers.InputLayer(shape=(None, 1, input_width, input_height))
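
As a rough illustration: a single input layer with shape `(None, 3, input_width, input_height)` is enough. In a 2D convolution, every filter spans all input channels, so the channels are not trained in isolation. The sketch below computes the weight shape `(num_filters, num_input_channels, rows, cols)` for different channel counts. The filter count of 32 and the filter size of 3 are arbitrary example values.

```python
def conv_weight_shape(num_filters, input_channels, filter_size):
    """Weight shape of a 2D convolution: each filter covers all input channels."""
    return (num_filters, input_channels, filter_size, filter_size)

def num_weights(shape):
    total = 1
    for dim in shape:
        total *= dim
    return total

gray_shape = conv_weight_shape(32, 1, 3)    # input shape (None, 1, H, W)
rgb_shape = conv_weight_shape(32, 3, 3)     # input shape (None, 3, H, W)
multi_shape = conv_weight_shape(32, 5, 3)   # more than 3 channels works the same way
```

Only the second entry of the input shape changes. The rest of the network definition stays the same.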

about the Conv2DLayer in DenseNet

I found that the nonlinearity in the Conv2DLayer of densenet.py and densenet_fast.py is None. But the Lasagne documentation says: "nonlinearity : callable or None
The nonlinearity that is applied to the layer activations. If None is provided, the layer will be linear."
So why is the activation in the Conv2DLayer linear instead of ReLU?

@f0k

"Art Style Transfer" producing noise without cuDNN

I'm trying to run the Art Style Transfer

Because I don't have an NVIDIA GPU, I changed the first import line below to the second:

from lasagne.layers.dnn import Conv2DDNNLayer as ConvLayer

from lasagne.layers import Conv2DLayer as ConvLayer

With the latest Lasagne (6674ed8a1ed6d4ed4c11e42a1cd809f8a84770c6) and Theano 0.8.2 installed via pip this runs without error, but produces noise for the optimized images:

My guess is that the pre-trained weights are not being applied correctly. But I'm not sure how to verify or fix this.

Seems related to previous issues

Shape mismatch using the CPU at the ImageNet Pretrained Network (VGG_S).ipynb

I'm having a mismatch problem when trying to set the weights at the model.

At first the notebook used the CUDA implementation, but my computer doesn't have a CUDA-enabled GPU, so I changed the conv layers to the CPU implementation (Conv2DLayer).

Maybe I can't use "pad=1" in a Conv2DLayer?

from lasagne.layers import InputLayer, DenseLayer, DropoutLayer, Conv2DLayer as ConvLayer
# from lasagne.layers.dnn import Conv2DDNNLayer as ConvLayer

In [5]: ### Load the model parameters and metadata
import pickle

model = pickle.load(open('vgg_cnn_s.pkl'))
CLASSES = model['synset words']
MEAN_IMAGE = model['mean image']

lasagne.layers.set_all_param_values(output_layer, model['values'])


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-ec94c9846d22> in <module>()
      5 MEAN_IMAGE = model['mean image']
      6 
----> 7 lasagne.layers.set_all_param_values(output_layer, model['values'])

/Users/mac/Sites/python/Diseaselyzer/venv/lib/python2.7/site-packages/lasagne/layers/helper.pyc in set_all_param_values(layer, values, **tags)
    450             raise ValueError("mismatch: parameter has shape %r but value to "
    451                              "set has shape %r" %
--> 452                              (p.get_value().shape, v.shape))
    453         else:
    454             p.set_value(v)

ValueError: mismatch: parameter has shape (12800, 4096) but value to set has shape (18432, 4096)

Thanks in advance \o
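
The two shapes in the error factor as 512·5·5 = 12800 and 512·6·6 = 18432, so the last feature map differs by one pixel per side. One setting that can produce exactly this gap is how the pooling layer treats partial windows at the border. This is an assumption and has not been confirmed for this notebook. In the sketch below, the feature map size of 17 and the 3x3 pooling with stride 3 are also assumed values.

```python
import math

def pool_out(size, pool, stride, ignore_border):
    """Pooled length along one axis, with or without partial border windows."""
    if ignore_border:
        return (size - pool) // stride + 1
    return int(math.ceil((size - pool) / float(stride))) + 1

channels, feature_map = 512, 17   # assumed size entering the last pooling layer

flat_ignore = channels * pool_out(feature_map, 3, 3, ignore_border=True) ** 2
flat_keep = channels * pool_out(feature_map, 3, 3, ignore_border=False) ** 2
```

Printing `lasagne.layers.get_output_shape` for each layer in both variants of the model should show where the shapes first diverge.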

the detail about inception v3

Hello, I have tried inception_v3 for fine-tuning, but the accuracy is really low, worse than GoogLeNet and VGG19. Can you tell me the details of your pretrained model, such as learning rate, batch size, data preprocessing, etc.?

Make Recipes a package?

As discussed in #16, it might be nice to make this repo (or a subset) into a package, so that things like pretrained models can be imported into users' code.

Anyone have arguments against? Or feel like making a PR?

Dropout missing in GoogLeNet

In the googlenet example, is it possible that the dropout after the average pool layer is missing? Compare to Szegedy et al., "Going deeper with convolutions", page 5, table 1.

Memory used in models vgg

For classification problems, I use the VGG models. I have a computer running 32-bit Debian 6.0.9 with 15.8 GB of RAM. When I try to load the vgg19 or vgg16 model, I get a memory error:

cnn_vgg19 = build_model_vgg19()
output_layer = cnn_vgg19['prob']

MemoryError Traceback (most recent call last)
in ()
----> 1 cnn_vgg19 = build_model_vgg19()
2 output_layer = cnn_vgg19['prob']

in build_model_vgg19()
39 net['conv5_3'], 512, 3, pad=1, flip_filters=False)
40 net['pool5'] = PoolLayer(net['conv5_4'], 2)
---> 41 net['fc6'] = DenseLayer(net['pool5'], num_units=4096)
42 net['fc6_dropout'] = DropoutLayer(net['fc6'], p=0.5)
43 net['fc7'] = DenseLayer(net['fc6_dropout'], num_units=4096)

/home/roman/anaconda/lib/python2.7/site-packages/lasagne/layers/dense.pyc in __init__(self, incoming, num_units, W, b, nonlinearity, **kwargs)
69 num_inputs = int(np.prod(self.input_shape[1:]))
70
---> 71 self.W = self.add_param(W, (num_inputs, num_units), name="W")
72 if b is None:
73 self.b = None

/home/roman/anaconda/lib/python2.7/site-packages/lasagne/layers/base.pyc in add_param(self, spec, shape, name, **tags)
212 name = "%s.%s" % (self.name, name)
213 # create shared variable, or pass through given variable/expression
--> 214 param = utils.create_param(spec, shape, name)
215 # parameters should be trainable and regularizable by default
216 tags['trainable'] = tags.get('trainable', True)

/home/roman/anaconda/lib/python2.7/site-packages/lasagne/utils.pyc in create_param(spec, shape, name)
349
350 elif hasattr(spec, '__call__'):
--> 351 arr = spec(shape)
352 try:
353 arr = floatX(arr)

/home/roman/anaconda/lib/python2.7/site-packages/lasagne/init.pyc in __call__(self, shape)
29 their :meth:sample() method.
30 """
---> 31 return self.sample(shape)
32
33 def sample(self, shape):

/home/roman/anaconda/lib/python2.7/site-packages/lasagne/init.pyc in sample(self, shape)
175
176 std = self.gain * np.sqrt(2.0 / ((n1 + n2) * receptive_field_size))
--> 177 return self.initializer(std=std).sample(shape)
178
179

/home/roman/anaconda/lib/python2.7/site-packages/lasagne/init.pyc in sample(self, shape)
98 def sample(self, shape):
99 return floatX(get_rng().uniform(
--> 100 low=self.range[0], high=self.range[1], size=shape))
101
102

mtrand.pyx in mtrand.RandomState.uniform (numpy/random/mtrand/mtrand.c:13575)()

mtrand.pyx in mtrand.cont2_array_sc (numpy/random/mtrand/mtrand.c:2902)()

MemoryError:

On a second computer with 32 GB of RAM and a 64-bit OS, the same code works correctly.
Does that mean 16 GB of RAM is not enough?
Is this normal?
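
A back-of-the-envelope calculation suggests that the limit is the 32-bit address space rather than the amount of RAM. The traceback fails while sampling the initial weights of fc6. With a 224x224 input, that weight matrix has 512·7·7 = 25088 rows and 4096 columns. The sketch below assumes that the samples are drawn as float64 and then converted to float32.

```python
def array_bytes(shape, bytes_per_element):
    total = bytes_per_element
    for dim in shape:
        total *= dim
    return total

fc6_shape = (512 * 7 * 7, 4096)          # pool5 output flattened -> 4096 units
as_float64 = array_bytes(fc6_shape, 8)   # temporary array from the random sampler
as_float32 = array_bytes(fc6_shape, 4)   # copy kept after conversion to float32

address_space_32bit = 2 ** 32            # upper bound for any 32-bit process
```

A 32-bit process cannot address more than 4 GiB, often less in practice, no matter how much RAM is installed. Each of these arrays also needs a contiguous block, so a single allocation of roughly 800 MB can fail well before physical memory is exhausted.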

Questions about Neural Style implementation

I have a few questions about the notebook with the implementation of "A Neural Algorithm for Artistic Style". Hopefully this is the right place for them? It'd be easier to work with and make pull requests if this was a script rather than a notebook, but it's up to you.

Overall I think this is by far the prettiest implementation I've seen of the algorithm, and it's been a pleasure to work with. My questions:

Other implementations benefit from having input images that are multiple of 32, but don't use convolution layer padding. Here, Lasagne is used to add padding, so should the input images be multiples of 32 or with an additional +2 to width and height?
The borders of the computed images seem to have dark areas around them, then the style fades in towards the middle. Is this due to the way the borders are handled? I'd expect it to be not black but the color of the mean pixel, so I'm not sure what's going on.
I'm finding the lbfgs in scipy to be quite unstable (compared to the one in Torch used by Justin's implementation), as it often returns the error below. This seems to be quite random depending on image size/parameters, and adding new features to the algorithm isn't helping. Any ideas?

Bad direction in the line search;
   refresh the lbfgs memory and restart the iteration.

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
*****    211    259      2     0     0   1.204D-03   2.472D+03
  F =   2472.1531027869614

ABNORMAL_TERMINATION_IN_LNSRCH

 Line search cannot locate an adequate point after 20 function
  and gradient evaluations.  Previous x, f and g restored.
 Possible causes: 1 error in function or gradient evaluation;
                  2 rounding error dominate computation.

I've noticed that the GPU memory during execution is constantly fluctuating, presumably because Theano allocates and de-allocates buffers. Is there a way to force it to just allocate the buffers it needs once then keep them in memory throughout the process?

Thanks again for the code, I've been very impressed with Lasagne because of it!

Caffe's bvlc_googlenet and Lasagne's GoogLeNet do not produce the exact same results

I have experimented with Lasagne's googlenet.py and I have noticed that although it produces results similar to Caffe's bvlc_googlenet they are not exactly the same. (By results I mean the output of the network after the softmax or the probabilities)

Should they produce the same results?

Wrong number of dimensions: expected 4, got 3 with shape

i am trying to successfully compile the style transfer example here:
https://github.com/Lasagne/Recipes/blob/master/examples/styletransfer/Art%20Style%20Transfer.ipynb

I installed CUDA on my OS X machine and got through some other cuDNN issues.
Now I am stuck at this error: "'Wrong number of dimensions: expected 4, got 3 with shape (768, 1024, 3).')"

The internet thinks it has something to do with a misshaped NumPy array.
Thanks for any advice.

/Users/stephan/anaconda/bin/python /Users/stephan/PycharmProjects/antimodular/Art_Style_Transfer.py
Using gpu device 0: GeForce GT 750M (CNMeM is disabled, CuDNN 4007)
Traceback (most recent call last):
  File "/Users/stephan/PycharmProjects/antimodular/Art_Style_Transfer.py", line 150, in <module>
    photo_features = {k: theano.shared(output.eval({input_im_theano: photo})) for k, output in zip(layers.keys(), outputs)}
  File "/Users/stephan/PycharmProjects/antimodular/Art_Style_Transfer.py", line 150, in <dictcomp>
    photo_features = {k: theano.shared(output.eval({input_im_theano: photo})) for k, output in zip(layers.keys(), outputs)}
  File "/Users/stephan/.local/lib/python3.5/site-packages/theano/gof/graph.py", line 523, in eval
    rval = self._fn_cache[inputs](*args)
  File "/Users/stephan/.local/lib/python3.5/site-packages/theano/compile/function_module.py", line 786, in __call__
    allow_downcast=s.allow_downcast)
  File "/Users/stephan/.local/lib/python3.5/site-packages/theano/tensor/type.py", line 177, in filter
    data.shape))
TypeError: ('Bad input argument to theano function with name "/Users/stephan/PycharmProjects/antimodular/Art_Style_Transfer.py:150"  at index 0(0-based)', 'Wrong number of dimensions: expected 4, got 3 with shape (768, 1024, 3).')

Process finished with exit code 1
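
The shape in the message, (768, 1024, 3), is a raw image in (height, width, channels) order, while the network input expects a 4D batch in (batch, channels, height, width) order. This suggests that the image reached `output.eval` without the notebook's preprocessing step, although that is an assumption. A minimal NumPy sketch of the layout change only, without resizing or mean subtraction:

```python
import numpy as np

photo = np.zeros((768, 1024, 3), dtype=np.uint8)   # stand-in for the loaded image

# (height, width, channels) -> (channels, height, width), then add a batch axis.
batch = np.transpose(photo, (2, 0, 1))[np.newaxis].astype(np.float32)
```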

InputLayer size in the models googlenet, vgg16, vgg19, vgg_cnn_s

I looked at four models in Recipes/modelzoo/: googlenet.py, vgg16.py, vgg19.py, and vgg_cnn_s.py.
For vgg16.py, vgg19.py, and vgg_cnn_s.py the input layer is set to InputLayer((None, 3, 224, 224)), i.e. the input image has size 224x224. Is it correct to load these models (e.g. vgg16, vgg19) but change the input layer, so that classification and retraining use images of another size (128x128 or other)?
The googlenet model sets the input layer without spatial dimensions: InputLayer((None, 3, None, None)).
Does this mean that the input image can be any size?

googlenet classification

I ran the sample from Recipes/examples/ImageNet Pretrained Network (VGG_S).ipynb, and based on it I wrote similar programs for the vgg16 and vgg19 models. The vgg16 and vgg19 models have the key 'mean value', so it is possible to compute MEAN_IMAGE as for the vgg_s model. GoogLeNet has no 'mean value' parameter. How do I calculate the mean value for GoogLeNet?

Spatial Transformation fails with downsample_factor =1.0

I am running the Spatial Transformation example (https://github.com/Lasagne/Recipes/blob/master/examples/spatial_transformer_network.ipynb) with downsample_factor = 1.0 and I am getting the following error:

MemoryError: Error allocating 110231552 bytes of device memory (out of memory).
Apply node that caused the error: GpuElemwise{Composite{(i0 * (i1 + Abs(i1)))},no_inplace}(CudaNdarrayConstant{[[[[ 0.5]]]]}, GpuElemwise{Add}[(0, 0)].0)
Toposort index: 279
Inputs types: [CudaNdarrayType(float32, (True, True, True, True)), CudaNdarrayType(float32, 4D)]
Inputs shapes: [(1, 1, 1, 1), (256, 32, 58, 58)]
Inputs strides: [(0, 0, 0, 0), (107648, 3364, 58, 1)]
Inputs values: [<CudaNdarray object at 0x7fef7314c7f0>, 'not shown']
Outputs clients: [[GpuContiguous(GpuElemwise{Composite{(i0 * (i1 + Abs(i1)))},no_inplace}.0)]]

Can anyone replicate the same issue?
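
The failed allocation matches the size of a single float32 activation tensor with the shape listed under "Inputs shapes". That is consistent with the GPU simply running out of memory once the feature maps are no longer downsampled. The sketch below reproduces the arithmetic. The smaller batch size of 64 is a hypothetical value for comparison.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor; float32 by default."""
    total = bytes_per_element
    for dim in shape:
        total *= dim
    return total

failed_alloc = tensor_bytes((256, 32, 58, 58))   # shape from "Inputs shapes" above
smaller_batch = tensor_bytes((64, 32, 58, 58))   # hypothetical batch size of 64
```

Since memory use scales linearly with the batch size, reducing the batch size is the simplest thing to try.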

converting caffemodel to pkl model

Hi, I have a trained VGG16 model for 2 classes, for which I want to visualize the saliency map using guided backpropagation. Could you please let me know how I can convert my caffemodel to the pkl model used in the corresponding notebook for guided backpropagation?

Or at least, how can I convert a .py Lasagne model to a .pkl model?

Best
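
For the second question, the .pkl files are ordinary pickles of the parameter arrays plus some metadata. The sketch below assumes that layout. The key names follow the VGG_S example on this page (`model['values']`, `model['synset words']`), and the guided backpropagation notebook may read different keys, so check what it loads. Nested lists stand in for the arrays that `lasagne.layers.get_all_param_values()` returns, and the file name and class names are hypothetical.

```python
import os
import pickle
import tempfile

# Stand-ins for the arrays returned by lasagne.layers.get_all_param_values().
param_values = [[[0.1, 0.2], [0.3, 0.4]], [0.0, 0.0]]

model = {
    'values': param_values,                   # key name borrowed from the VGG_S example
    'synset words': ['class_a', 'class_b'],   # hypothetical class names
}

path = os.path.join(tempfile.mkdtemp(), 'my_vgg16.pkl')   # hypothetical file name
with open(path, 'wb') as f:
    pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)

# Loading mirrors what the notebooks do before set_all_param_values().
with open(path, 'rb') as f:
    restored = pickle.load(f)
```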

Increasingly negative loss in variational autoencoder: is it normal?

Hi,
not sure if it's an issue.
I am training the variational autoencoder with a different set of images with 3 color channels.
I am getting an increasingly negative loss. I wonder: is this a normal or valid outcome or is it a bug?

Shouldn't the loss value be always a positive amount? I am worried because the search for
a minimum of the loss function might not get to anything if the loss is not bounded by 0.

Building model and compiling functions...
L = 2, z_dim = 1, n_hid = 3, binary=True
Starting training...
Epoch 1 of 300 took 36.576s
  training loss:        1193603.765134
  validation loss:      358401.526396
Epoch 2 of 300 took 34.345s
  training loss:        170094.748865
  validation loss:      -990985.720292
Epoch 3 of 300 took 34.682s
  training loss:        -948598.243076
  validation loss:      -2374793.240720
Epoch 4 of 300 took 33.571s
  training loss:        -2179357.580108
  validation loss:      -3822347.805930
Epoch 5 of 300 took 36.031s
  training loss:        -3293897.853456
  validation loss:      -5299324.057571

I tried with 3 or 1024 hidden units, and z dimension being either 1 or 2, but the result is the same.
Using the regular MNIST I have no issues: the loss value is positive and decreasing toward 0.
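
One possible explanation, assuming the reconstruction term is a binary cross-entropy (the log shows binary=True) and the new images are not scaled to [0, 1]: with targets in [0, 1] the cross-entropy is nonnegative, but with raw 8-bit pixel values as targets it has no lower bound. The sketch below evaluates the per-pixel term for both cases.

```python
import math

def binary_crossentropy(p, t):
    """Per-pixel binary cross-entropy for prediction p in (0, 1) and target t."""
    return -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))

in_range = binary_crossentropy(0.9, 1.0)             # target scaled to [0, 1]
out_of_range = binary_crossentropy(0.9, 255.0)       # raw 8-bit pixel value as target
pushed_further = binary_crossentropy(0.999, 255.0)   # a more confident prediction
```

With out-of-range targets the optimizer can keep lowering the loss by pushing predictions toward 1, which would match the steadily decreasing values in the log. Dividing the images by 255 before training is a quick way to test this.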

draw_net bug (very minor)

Hi,

It would be nice to have this as part of the Recipes repo:

https://gist.github.com/ebenolson/1682625dc9823e27d771

BTW, in order for that code to work now, get_output_shape must be replaced with output_shape

Cheers,
Chris

Cannot run CIFAR 10 example properly on gpu

It seems that I cannot run CIFAR 10 example properly on gpu (runs on cpu anyway), and that the problem is with Lasagne somehow, as I do not have similar problem with Keras.

Here is my log of the example on gpu:

$ THEANO_FLAGS=device=gpu python Deep_Residual_Learning_CIFAR-10.py 
Using gpu device 0: GeForce GTX TITAN X (CNMeM is disabled, cuDNN 5005)
Loading data...
Building model and compiling functions...
number of parameters in model: 464154
Starting training...
8.12522697449
7.34526205063
7.88604187965
7.31621003151
7.64691305161

The numbers at the end indicate the time in seconds needed to process a single batch. In particular, I time the line train_err += train_fn(inputs, targets).

This is the log without gpu (default in my case):

$ python Deep_Residual_Learning_CIFAR-10.py 
Loading data...
Building model and compiling functions...
number of parameters in model: 464154
Starting training...
7.94606494904
7.46665215492
7.50057983398
7.23336601257
7.53180503845

I observe that the GPU is mostly idle during training; nvidia-smi constantly gives something like this:

+------------------------------------------------------+                       
| NVIDIA-SMI 352.93     Driver Version: 352.93         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT...  Off  | 0000:01:00.0      On |                  N/A |
| 26%   65C    P5    40W / 250W |   1971MiB / 12287MiB |     15%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0       922    G   ...ves-passed-by-fd --v8-snapshot-passed-by-   292MiB |
|    0       979    G   /usr/bin/X                                    1015MiB |
|    0      2420    G   compiz                                         506MiB |
|    0     22479    C   python                                         127MiB |
+-----------------------------------------------------------------------------+

However, all CPU cores are busy with computations, even when I intend to run on the GPU.

My OS is Ubuntu 15. I have installed bleeding edge Theano.
I can run things with Theano + Keras on the GPU with THEANO_FLAGS=device=gpu python ... - there I can see that the GPU is loaded and I observe a speedup.
Please let me know if there indeed is some sort of bug or if this is normal. Any help is greatly appreciated!
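One thing that may be worth ruling out is that the flags never reach the script. A small sketch, assuming the comma-separated key=value format of THEANO_FLAGS. The helper name `parse_theano_flags` is made up for illustration and is not part of Theano:

```python
import os

def parse_theano_flags(flags):
    """Split a THEANO_FLAGS-style string into a dict of key -> value."""
    parsed = {}
    for item in flags.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        parsed[key.strip()] = value.strip()
    return parsed

# Inspect what the current process was actually started with.
flags = parse_theano_flags(os.environ.get("THEANO_FLAGS", ""))
print("device:", flags.get("device", "<not set>"))
print("floatX:", flags.get("floatX", "<not set>"))
```

If `device` is not set here, the value has to come from a Theano config file or the default. As far as I know, the old GPU backend only accelerates float32 computations, so a missing `floatX=float32` could also leave the work on the CPU.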

Would you be interested in a stereo convolutional network example?

Hi,
I have recently been working on a project that estimates a depth map from a pair of stereo images using a fully convolutional network (https://github.com/LouisFoucard/StereoConvNet). Would you be interested in adding that as an example? It is fairly short, and it uses batch normalization, so it could also serve as an example of how to implement batch normalization.
The data (stereo images and depth maps) used to train the network is obtained by generating random 3D scenes with Blender, and is also available on GitHub (https://github.com/LouisFoucard/DepthMap_dataset).
Please let me know if that is of interest to you, and I can clean it up a bit and add some more comments.

vgg without gpu

All networks in the zoo use DNN layers. Is there a way to use a pretrained model if I don't have a GPU installed?

Add external links to README?

We have a wiki page on the main library repository with a bunch of links to Lasagne extensions, as well as code that makes use of the library: https://github.com/Lasagne/Lasagne/wiki/3rd-party-extensions-and-code

The page isn't very up to date and I think a lot of people don't know it exists. Perhaps we should start putting these links into the README of this repository instead. People looking for Lasagne-based code / extensions will be much more likely to find it that way (it'll be on the front page when they visit this repo).

Alternatively we could have a separate text file (or text files) with external links + descriptions in this repo somewhere.

Thoughts?

Lasagne Inception CIFAR-10 Example

Do you want to add this to the examples folder?

https://github.com/ieee8023/NeuralNetwork-Examples/blob/master/theano/cnn/lasagne-cifar10-inception-example.ipynb

Preactivation ResNets

Hello all

Hopefully this is the appropriate place to post this.

I've been working on this off and on since the Preactivation ResNet paper was published (https://arxiv.org/abs/1603.05027) and I finally think I flushed out the bugs in my code. Would this be something worth adding to the Lasagne Recipes? It would be easy to copy and paste my models into the existing ResNet example so that everything is kept consistent.

My code:
https://github.com/FlorianMuellerklein/Identity-Mapping-ResNet-Lasagne

ImageNet Pretrained Network CPU/GPU

Hi all,

I tried the great example of ImageNet Pretrained Network (VGG_S).ipynb.
https://github.com/Lasagne/Recipes/blob/master/examples/ImageNet%20Pretrained%20Network%20(VGG_S).ipynb

With a GPU I get the same result as in the tutorial.
But with a CPU I am getting different results.

For example, instead of "German Shepherd" the net "thinks" it's a "Border terrier".

Why is that?
Can I fix it?

Best
Eitan

Variational Autoencoder: cannot understand why there is a '2' coefficient for log sigma

Hi,

This is line 190 of variational_autoencoder.py:

- 0.5 * T.sqr(tgt - mu) / T.exp(2 * ls))

Where does that coefficient of 2 for the log sigma come from? I did the derivations myself and I could not find it. This other implementation: https://github.com/y0ast/Variational-Autoencoder/blob/master/VAE.py does not include that multiplier. Any explanation? Is it a bug?
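One possible reading, assuming `ls` in that script denotes log σ (the log of the standard deviation) rather than log σ²: the Gaussian log-density picks up the factor 2 when σ² is rewritten in terms of `ls`.

```latex
\begin{align*}
\log \mathcal{N}(x \mid \mu, \sigma^2)
  &= -\tfrac{1}{2}\log(2\pi) - \log\sigma - \frac{(x-\mu)^2}{2\sigma^2} \\
  &= -\tfrac{1}{2}\log(2\pi) - \mathrm{ls}
     - \tfrac{1}{2}\,\frac{(x-\mu)^2}{\exp(2\,\mathrm{ls})},
  \qquad \mathrm{ls} = \log\sigma,\quad \sigma^2 = \exp(2\,\mathrm{ls}).
\end{align*}
```

If a network instead outputs the log-variance v = log σ², the denominator becomes exp(v) with no factor 2. Under that assumption the two forms describe the same density, only with different parameterizations.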

Wrong pretrained weights for UNet example

Hi,
It seems I gave you the wrong link when I asked you to update the weights for the UNet example on AWS. I am very sorry for that!
https://www.dropbox.com/s/h4fhpzeqzgxl4qw/UNet_params_pretrained.zip?dl=0
Those are the correct weights. I would be very happy if you could upload them!
Cheers,
Fabian

The speed of Deep_Residual_Learning_CIFAR-10.py

I am using Ubuntu 14.04 with a Titan X GPU and cuDNN v4.
When I run the Deep_Residual_Learning_CIFAR-10.py script, the results are as follows:

Using gpu device 3: GeForce GTX TITAN X (CNMeM is disabled, cuDNN 4007)
Loading data...
Building model and compiling functions...
number of parameters in model: 464154
Starting training...
Epoch 1 of 82 took 194.576s
  training loss:                1.885696
  validation loss:              1.185959
  validation accuracy:          58.67 %
Epoch 2 of 82 took 192.870s
  training loss:                1.241701
  validation loss:              0.860152
  validation accuracy:          70.31 %
Epoch 3 of 82 took 196.294s
  training loss:                0.969671
  validation loss:              0.767420
  validation accuracy:          75.22 %
Epoch 4 of 82 took 193.210s
  training loss:                0.831287
  validation loss:              0.929070
  validation accuracy:          72.58 %
Epoch 5 of 82 took 194.127s
  training loss:                0.757095
  validation loss:              0.628573
  validation accuracy:          79.36 %
Epoch 6 of 82 took 194.738s
  training loss:                0.710511
  validation loss:              0.578690
  validation accuracy:          80.73 %

Is this running speed normal?
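For scale, a quick back-of-the-envelope projection from the six epoch times in the log above, assuming later epochs take about as long as these:

```python
# Epoch durations (seconds) copied from the log above.
epoch_times = [194.576, 192.870, 196.294, 193.210, 194.127, 194.738]
total_epochs = 82  # from "Epoch 1 of 82"

mean_epoch = sum(epoch_times) / len(epoch_times)
projected_hours = mean_epoch * total_epochs / 3600.0

print("mean epoch time: %.1f s" % mean_epoch)
print("projected total: %.1f h" % projected_hours)
```

That works out to roughly 194 s per epoch and about 4.4 hours for the full 82 epochs.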

Using VGG nets

I'm trying to use the VGG-16 net with pretrained weights.

The link https://s3.amazonaws.com/lasagne/recipes/pretrained/imagenet/vgg16.pkl does not seem to be public?

@ebenolson : I can download the file if I log in with the information you gave me.

I'm not sure how I should preprocess my data to make the model work. I looked at the preprocessing description in the repo:

In the paper, the model is denoted as the configuration D trained with scale jittering. The input images should be zero-centered by mean pixel (rather than mean image) subtraction. Namely, the following BGR values should be subtracted: [103.939, 116.779, 123.68].

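I guess I should do something like (not tested):

import numpy as np

# Per-channel means, shaped (3, 1, 1) so they broadcast over height and width
MEAN_VALUE = np.array([103.939, 116.779, 123.68]).reshape(3, 1, 1)   # BGR

def preprocess(img):
    # img is (channels, height, width) in RGB order, values are 0-255
    img = img[::-1].astype(np.float32)  # switch to BGR; float copy so the subtraction works
    img -= MEAN_VALUE
    return img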

Maybe we should add preprocess functions to the modelzoo?

auto encoder

Hi, I am trying to train an autoencoder using a spatial transformer network (sptn). I initially thought it could be used in isolation, but in that case backpropagation does not seem to work. So I attached a final layer, which is linear. Still, it seems like the sptn does not bring anything compared to a single dense layer. Any ideas?

import numpy as np
import theano
import lasagne

# Assumed aliases: the snippet uses `conv` and `pool` without defining them.
conv = lasagne.layers.Conv2DLayer
pool = lasagne.layers.MaxPool2DLayer

# input_width and input_height are defined elsewhere
ini = lasagne.init.HeUniform()
l_in = lasagne.layers.InputLayer(shape=(None, 1, input_width, input_height))
# localization part
# bias initialised to the identity affine transform
b = np.zeros((2, 3), dtype=theano.config.floatX)
b[0, 0] = 1
b[1, 1] = 1
b = b.flatten()
loc_l1 = pool(l_in, pool_size=(2, 2))
loc_l2 = conv(loc_l1, num_filters=20, filter_size=(5, 5), W=ini)
loc_l3 = pool(loc_l2, pool_size=(2, 2))
loc_l4 = conv(loc_l3, num_filters=20, filter_size=(7, 7), W=ini)
loc_l5 = lasagne.layers.DenseLayer(loc_l4, num_units=50, W=lasagne.init.HeUniform('relu'))
loc_out = lasagne.layers.DenseLayer(loc_l5, num_units=6, b=b, W=lasagne.init.Constant(0.0),
                                    nonlinearity=lasagne.nonlinearities.identity)

# transformer
l_trans1 = lasagne.layers.TransformerLayer(l_in, loc_out, downsample_factor=1)

# final linear layer
l_enc2 = lasagne.layers.DenseLayer(l_trans1, num_units=input_width * input_height,
                                   nonlinearity=lasagne.nonlinearities.linear, name='final')

l_out = lasagne.layers.ReshapeLayer(l_enc2, shape=(-1, 1, input_width, input_height))
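For reference, the bias `b` above, together with the zero-initialised `W`, makes the localization output start at the identity affine transform, so the transformer initially passes the input through unchanged. A plain-Python sketch of what those six parameters do to a coordinate. The helper `apply_affine` is hypothetical and not a Lasagne function:

```python
def apply_affine(theta, x, y):
    """Apply a flattened 2x3 affine matrix [a, b, tx, c, d, ty] to (x, y)."""
    a, b, tx, c, d, ty = theta
    return (a * x + b * y + tx, c * x + d * y + ty)

# Same values as the flattened bias in the snippet above:
# rows [1, 0, 0] and [0, 1, 0].
identity_theta = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]

for point in [(-1.0, -1.0), (0.0, 0.5), (1.0, 1.0)]:
    print(point, "->", apply_affine(identity_theta, *point))
```

Every point maps to itself under these parameters. Any benefit from the transformer would therefore have to come from the localization network learning to move away from this starting point.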

Is there any residual network example on ImageNet?

Hi,
I'm delighted to find a residual network example on CIFAR-10:
https://github.com/Lasagne/Recipes/blob/master/papers/deep_residual_learning/Deep_Residual_Learning_CIFAR-10.py

Are there any residual network examples on ImageNet?

Thanks!

Broken links in Video features with C3D.ipynb example

@gyglim I am trying to run the Video features with C3D.ipynb example, but the links are either broken or access is forbidden. Would it be possible to upload the files to some other server? Thanks!
