Lasagne / Recipes
Lasagne recipes: examples, IPython notebooks, ...
License: MIT License
The vgg16 pretrained model claims to be for non-commercial use only, while the VGGNet website says "The models are released under Creative Commons Attribution License."
What is the reason for the limitation?
The link to the ResNet50 Lasagne weights appears to be broken or to lack the right permissions.
That is, the download link in https://github.com/Lasagne/Recipes/blob/master/modelzoo/resnet50.py (https://s3.amazonaws.com/lasagne/recipes/pretrained/imagenet/resnet50.pkl) doesn't work.
Cheers
Casper
Hi, I made a script for transferring weights from the Caffe ImageNet-pretrained ResNet-50 to Lasagne (https://github.com/mephistopheies/resnet50caffe2lasagne/blob/master/resnet-50.ipynb), like the existing one for VGG (https://github.com/Lasagne/Recipes/blob/master/examples/Using%20a%20Caffe%20Pretrained%20Network%20-%20CIFAR10.ipynb).
Would it be useful to have such a recipe script in the "modelzoo" folder and/or the "examples" folder?
I ran the model (cifar_model_n5.npz) from the file saved here and got the following results:
Loading data...
Building model and compiling functions...
number of parameters in model: 464154
testing phase!
Final results:
test loss: 3.326520
test accuracy: 40.55 %
I do not understand why the accuracy is so low.
I tried to run the model conversion example using exactly the same code as in the notebook, but an accuracy of only 0.406 is reported instead of 0.894. I wonder if something is broken (some intermediate results are not the same, e.g. conv1).
I hope this is the appropriate venue to post this. I don't have an implementation yet, but maybe this ticket could encourage some work.
I am currently interested in this stochastic depth paper:
http://arxiv.org/pdf/1603.09382v2.pdf
I was going to have a go at implementing this, but I was a bit stumped as to how one would go about the identity transform that is mentioned in equation (2). As you can see, if the next layer and the current layer have different output shapes, you need to linearly project the output of the current layer so that it matches the dimensions of the output of the following layer. I'm not clear on how this is done and am afraid it's blatantly obvious... is your "projection matrix" (or whatever it's called) a matrix (of some appropriate shape) consisting solely of ones? Furthermore, how would we do this for convolutional networks?
It seems like that's the only roadblock for me -- the binomial mask is easy to do.
Let me know what you think.
PS: Interestingly, I found a post asking how to go about implementing this, but it seems to omit the identity transform.
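For reference, the stochastic depth formulation leaves the identity path untouched and gates only the residual branch with a Bernoulli variable, so no projection is involved as long as the block's input and output shapes agree. The following is a minimal pure-Python sketch on flat lists, not a Lasagne implementation: `stochastic_depth_block`, `residual_fn` and `survival_prob` are hypothetical names, and the final ReLU is omitted for brevity. Blocks that change shape would still need some shortcut mapping (for example zero-padding or a 1x1 convolution, as in residual networks), which this sketch does not cover.

```python
import random

def stochastic_depth_block(x, residual_fn, survival_prob, training, rng=random):
    """Return x + b * residual_fn(x), with b ~ Bernoulli(survival_prob) in training.

    At test time the residual branch is always kept but scaled by
    survival_prob, i.e. by its expected contribution during training.
    """
    if training:
        if rng.random() >= survival_prob:
            # Block is dropped: only the identity path survives.
            return list(x)
        scale = 1.0
    else:
        scale = survival_prob
    fx = residual_fn(x)
    return [xi + scale * fi for xi, fi in zip(x, fx)]

def double(v):
    # Hypothetical residual branch used only for illustration.
    return [2.0 * vi for vi in v]

x = [1.0, -2.0, 3.0]
test_out = stochastic_depth_block(x, double, survival_prob=0.5, training=False)
dropped = stochastic_depth_block(x, double, survival_prob=0.0, training=True)
kept = stochastic_depth_block(x, double, survival_prob=1.0, training=True)
```

With `survival_prob=0.0` the block reduces to the identity, and with `survival_prob=1.0` it is an ordinary residual block.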
Is anyone interested in CTC?
I ported a Theano CTC implementation to Lasagne: https://github.com/skaae/Lasagne-CTC
The problem is that I haven't really had time to test it, and I do not have a nice dataset for testing.
Just thought we could collect some ideas for things that'd be reasonably easy to turn into Lasagne Recipes. One of them is Ben Graham's "Index-learning of unsupervised low dimensional embeddings", available at: http://www2.warwick.ac.uk/fac/sci/statistics/staff/academic-research/graham/indexlearning.pdf
Anybody who's interested in doing this, raise your hand here! :)
Hi all,
I just tried the inception model and couldn't get it to work. It always gives a very low accuracy.
1, I checked the pretrained weights, and there are negative values in the mean of batch norm. However, batch norm comes after the convolution layer with ReLU units, which always produce nonnegative values, so the mean should always be nonnegative. Why is the mean in the pretrained weights negative?
2, According to the Lasagne documentation quoted below, batch norm should be used after the convolution and before the ReLU. Is this the case, or should it be used after the ReLU and before the convolution of the next layer? Since going through the ReLU always suppresses the negative values, does it make sense to insert it between the convolution and the ReLU?
"This layer should be inserted between a linear transformation (such as a DenseLayer, or Conv2DLayer) and its nonlinearity. The convenience function batch_norm() modifies an existing layer to insert batch normalization in front of its nonlinearity."
3, There are 1008 output units in the model, while there are only 1000 classes in ImageNet. From this line (https://github.com/soumith/inception.torch/blob/master/googlenet.lua#L113), I found that only classes 2-1001 are used for ImageNet. Is this true? I then used the following operation. I just want to check with you that this is correct.
T.set_subtensor(net['softmax'].W[0:1000], inceptionmodel_param_values[470][1:1001])
T.set_subtensor(net['softmax'].b[0:1000], inceptionmodel_param_values[471][1:1001])
Thanks a lot.
Hi all,
As shown in the subject, I cannot reproduce the accuracy reported in the papers with the pretrained weights, for both VGG and inception_v3 (ImageNet classification). It is always about 3% lower. For VGG, it is about 69% (72.7% reported), and for inception_v3, it is about 78% (81.3% reported).
It is driving me crazy. Any advice would be highly appreciated.
My settings:
1, Data is preprocessed using caffe to produce lmdb. It is resized to 256 for vgg and 384 for inception.
2, When testing, the data is oversampled with 10 crops (4 corners with center crop, plus flip). This is slightly different from the paper, but it should not cause 3% difference. It only improves about 1% using oversample compared to using the center crop. Oversample code is attached at the end.
3, The pretrained weights are downloaded from the model zoo.
4, Model is tested using GPU with theano configuration as THEANO_FLAGS='floatX=float32,device=gpu0,mode=FAST_RUN, nvcc.fastmath=True'.
Oversampling code:
datum.ParseFromString(value)
label = datum.label
img = np.array(bytearray(datum.data)).reshape(datum.channels, datum.height, datum.width)
for oversamplei in range(5):
    dx=self.cropindex[oversamplei][0]
    dy=self.cropindex[oversamplei][1]
    tempimg = img[:,dy:dy+self.crop_height,dx:dx+self.crop_width]
    for flipi in range(2):
        if flipi==1:
            tempimg = tempimg[:,:,::-1]
        self.data_batches[i*10+oversamplei*2+flipi,:,:,:] = tempimg-BGR_mean
        self.labels_batches[i*10+oversamplei*2+flipi] = np.int32(label)
Test code:
test_vggprediction = lasagne.layers.get_output(vggmodel['prob'], X_sym, deterministic=True)
_,vggprediction_shape=test_vggprediction.shape
temptest_vggprediction=test_vggprediction.reshape((-1,10,vggprediction_shape))
lable_oversample=y_sym[::10]
test_vggprediction_oversample=T.mean(temptest_vggprediction,axis=1,dtype=theano.config.floatX)
test_vggacc=T.mean(lasagne.objectives.categorical_accuracy(test_vggprediction_oversample, lable_oversample, top_k=1),dtype=theano.config.floatX)
test_vggacc_top5=T.mean(lasagne.objectives.categorical_accuracy(test_vggprediction_oversample, lable_oversample, top_k=5),dtype=theano.config.floatX)
Again, any advice would be highly appreciated.
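As a cross-check of the ten-crop scheme described in this post (four corners plus center, each with a horizontal flip), the crop plan can be sketched in plain Python. `five_crop_offsets` and `ten_crop_plan` are hypothetical helpers standing in for however `self.cropindex` is built in the original script, and the 256/224 sizes are only an example.

```python
def five_crop_offsets(img_height, img_width, crop_height, crop_width):
    """Return (dx, dy) top-left offsets: four corners, then the center."""
    max_dx = img_width - crop_width
    max_dy = img_height - crop_height
    if max_dx < 0 or max_dy < 0:
        raise ValueError("crop is larger than the image")
    return [
        (0, 0),                      # top-left
        (max_dx, 0),                 # top-right
        (0, max_dy),                 # bottom-left
        (max_dx, max_dy),            # bottom-right
        (max_dx // 2, max_dy // 2),  # center
    ]

def ten_crop_plan(img_height, img_width, crop_height, crop_width):
    """Pair every offset with flip=False and flip=True, giving 10 views."""
    offsets = five_crop_offsets(img_height, img_width, crop_height, crop_width)
    return [(dx, dy, flip) for (dx, dy) in offsets for flip in (False, True)]

plan = ten_crop_plan(256, 256, 224, 224)
```

The ordering matches the `i*10+oversamplei*2+flipi` indexing used above: each offset is followed immediately by its flipped version, so averaging over consecutive groups of ten rows averages over one image.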
Dears,
I tried https://github.com/Lasagne/Recipes/blob/master/examples/spatial_transformer_network.ipynb
but I get an error about types :
X = T.tensor4()
y = T.ivector()
# training output
output_train = lasagne.layers.get_output(model, X, deterministic=False)
# evaluation output. Also includes output of transform for plotting
output_eval, transform_eval = lasagne.layers.get_output([model, l_transform], X, deterministic=True)
sh_lr = theano.shared(lasagne.utils.floatX(LEARNING_RATE))
cost = T.mean(T.nnet.categorical_crossentropy(output_train, y))
updates = lasagne.updates.adam(cost, model_params, learning_rate=sh_lr)
train = theano.function([X, y], [cost, output_train], updates=updates)
eval = theano.function([X], [output_eval, transform_eval])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-22-c9e0c22f04d4> in <module>()
12 updates = lasagne.updates.adam(cost, model_params, learning_rate=sh_lr)
13
---> 14 train = theano.function([X, y], [cost, output_train], updates=updates)
15 eval = theano.function([X], [output_eval, transform_eval])
/usr/local/lib/python2.7/site-packages/theano/compile/function.pyc in function(inputs, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input)
315 on_unused_input=on_unused_input,
316 profile=profile,
--> 317 output_keys=output_keys)
318 # We need to add the flag check_aliased inputs if we have any mutable or
319 # borrowed used defined inputs
/usr/local/lib/python2.7/site-packages/theano/compile/pfunc.pyc in pfunc(params, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input, output_keys)
487 rebuild_strict=rebuild_strict,
488 copy_inputs_over=True,
--> 489 no_default_updates=no_default_updates)
490 # extracting the arguments
491 input_variables, cloned_extended_outputs, other_stuff = output_vars
/usr/local/lib/python2.7/site-packages/theano/compile/pfunc.pyc in rebuild_collect_shared(outputs, inputs, replace, updates, rebuild_strict, copy_inputs_over, no_default_updates)
202 ' function to remove broadcastable dimensions.')
203
--> 204 raise TypeError(err_msg, err_sug)
205 assert update_val.type == store_into.type
206
TypeError: ('An update must have the same type as the original shared variable (shared_var=<CudaNdarrayType(float32, vector)>, shared_var.type=CudaNdarrayType(float32, vector), update_val=Elemwise{add,no_inplace}.0, update_val.type=TensorType(float64, vector)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')
Where could this come from?
Best
I have Lasagne versions of vgg-16, vgg-19 and googlenet with their pretrained parameters:
http://www.vlfeat.org/matconvnet/pretrained/
https://github.com/BVLC/caffe/tree/master/models/bvlc_googlenet
Would there be interest in adding these recipes? I reckon these recipes are more the microwave lasagne kind, but this might interest people too?
It's a common pitfall that the model zoo defines its models to use the Conv2DDNNLayer, implicitly defaulting to flip_filters=False. When people modify it to use the Conv2DLayer (which defaults to flip_filters=True), the models don't work as expected.
With the bleeding-edge version of Lasagne, there are three options to resolve this:
1. Keep the Conv2DDNNLayer and explicitly add flip_filters=False to the constructors
2. Switch to the Conv2DLayer and explicitly add flip_filters=False to the constructors (should be the same performance)
3. Switch to the Conv2DLayer with the default flip_filters=True setting and update the pickle files accordingly
edit: Looks like I misread.
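To illustrate why the flip_filters setting matters, here is a small pure-Python sketch (no Lasagne involved; all function names are made up for the illustration). It shows that a true convolution is a cross-correlation with the kernel flipped along both spatial axes, so weights trained under one convention give different outputs under the other unless they are flipped first.

```python
def flip2d(kernel):
    """Flip a 2D kernel along both axes."""
    return [row[::-1] for row in kernel[::-1]]

def correlate2d_valid(image, kernel):
    """'Valid' cross-correlation: slide the kernel without flipping it."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def convolve2d_valid(image, kernel):
    """'Valid' true convolution: correlation with the flipped kernel."""
    return correlate2d_valid(image, flip2d(kernel))

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]

corr = correlate2d_valid(image, kernel)               # no filter flipping
conv = convolve2d_valid(image, kernel)                # filters flipped
conv_fixed = convolve2d_valid(image, flip2d(kernel))  # pre-flipped weights
```

Here `corr` and `conv` differ, while `conv_fixed` equals `corr`. For a 4D NumPy weight array `W` of shape (num_filters, num_channels, rows, cols), the corresponding flip would be `W[:, :, ::-1, ::-1]`.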
Is it worth implementing Fast R-CNN and Faster R-CNN with Lasagne? Both are state-of-the-art object detection methods.
My current guess is that Faster R-CNN is easier, since it relies purely on two networks (one for proposals, another for detection), and I would like to work on it. Has anyone tried that?
Hi, I am currently modifying the saliency maps of
so that I can plot out the saliencies for all layers and every 8 filters in the vgg net. Here is what I did, but the function compilation stage is prohibitively slow. It seems to me that the gradient loop did not properly exploit the stacked structure of the vgg net and has to go through the graph every single time.
Is there a better way to do it? Thanks!
def compile_saliency_function1(net,layernamelist,layershapelist,scalefactor):
    inp = net['input'].input_var
    outp = lasagne.layers.get_output([net[layername] for layername in layernamelist], deterministic=True)
    saliencyfnlist=[]
    for layeri in range(len(layernamelist)):
        filtercount=int(layershapelist[layeri]/scalefactor)
        filterindices=[ii*scalefactor for ii in range(filtercount)]
        layeroutp=outp[layeri]
        saliencylayerlist=[]
        for filterindex in filterindices:
            max_outpi=layeroutp[0,filterindex,]
            saliencylayerlist.append(theano.grad(max_outpi.sum(), wrt=inp))
        print(len(saliencylayerlist))
        layerfnlist=theano.function([inp], saliencylayerlist)
        saliencyfnlist.append([layerfnlist])
    return saliencyfnlist

starttime=time.time()
saliencyfntuple=compile_saliency_function1(net,['conv5_1','conv5_2','conv5_3'],[512,512,512],8)
print('fn time',time.time()-starttime)
In the function build_simple_block in resnet50.py, the stride and pad arguments to the Conv2DLayer (ConvLayer) constructor are given in the order pad then stride (if use_bias is True, not if use_bias is False), but the Conv2DLayer constructor takes those two arguments in the order stride then pad.
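The effect of swapping two positional arguments can be shown with a stand-in function. `conv_layer` below is hypothetical and only mirrors the stride-then-pad parameter order described in this report; it is not the Lasagne constructor.

```python
def conv_layer(incoming, num_filters, filter_size, stride=1, pad=0):
    """Hypothetical stand-in that only records how its arguments were bound."""
    return {"num_filters": num_filters, "filter_size": filter_size,
            "stride": stride, "pad": pad}

# Intended: stride=2, pad=0, but passed positionally in the order pad, stride.
wrong = conv_layer("input", 64, 1, 0, 2)

# Keyword arguments make the call independent of the parameter order.
right = conv_layer("input", 64, 1, pad=0, stride=2)
```

In the first call the values are silently bound to the wrong parameters (`stride=0`, `pad=2`), which is why passing both by keyword is the safer choice.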
@gyglim
I was wondering if you could share your code on training the C3D (e.g. training dataset, optimizer). I am trying to train C3D on my own data but either get NaN or memory problems. How did you manage to train such a massive network? Thanks
@benanne @dnouri @craffel @f0k @skaae (and anyone else of course)
With first release imminent it would be nice to have a bit more here... I know a couple of you have stuff written up already, but I bet everyone has some suitable code lying around.
If you have anything you're willing to contribute, please open a PR... don't worry if it's not perfect, I can take care of making sure everything functions with the latest Lasagne.
Hi,
How do I train with RGB image input? Do I need 3 input layers? If I use '3' instead of shape[1], does this mean all channels are trained in isolation? Can I use more than 3 channels?
ini = lasagne.init.HeUniform()
l_in = lasagne.layers.InputLayer(shape=(None, 1, input_width, input_height))
I found that the nonlinearity in the Conv2DLayer of densenet.py and densenet_fast.py is None. But the Lasagne documentation says: "nonlinearity : callable or None. The nonlinearity that is applied to the layer activations. If None is provided, the layer will be linear."
So why is the activation in the Conv2DLayer linear instead of ReLU?
I'm trying to run the Art Style Transfer
Because I don't have an nVidia GPU I changed the line:
from lasagne.layers.dnn import Conv2DDNNLayer as ConvLayer
to
from lasagne.layers import Conv2DLayer as ConvLayer
With the latest Lasagne (6674ed8a1ed6d4ed4c11e42a1cd809f8a84770c6) and Theano 0.8.2 installed via pip this runs without error, but produces noise for the optimized images:
My guess is that the pre-trained weights are not being applied correctly. But I'm not sure how to verify or fix this.
Seems related to previous issues
I'm getting a shape mismatch when trying to set the weights of the model.
The model originally used the CUDA implementation, but my computer doesn't have a CUDA-enabled GPU, so I changed the conv layers to the CPU implementation (Conv2DLayer).
Is it not possible to use "pad=1" in a Conv2DLayer?
from lasagne.layers import InputLayer, DenseLayer, DropoutLayer, Conv2DLayer as ConvLayer
# from lasagne.layers.dnn import Conv2DDNNLayer as ConvLayer
In [5]: ### Load the model parameters and metadata
import pickle
model = pickle.load(open('vgg_cnn_s.pkl'))
CLASSES = model['synset words']
MEAN_IMAGE = model['mean image']
lasagne.layers.set_all_param_values(output_layer, model['values'])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-ec94c9846d22> in <module>()
5 MEAN_IMAGE = model['mean image']
6
----> 7 lasagne.layers.set_all_param_values(output_layer, model['values'])
/Users/mac/Sites/python/Diseaselyzer/venv/lib/python2.7/site-packages/lasagne/layers/helper.pyc in set_all_param_values(layer, values, **tags)
450 raise ValueError("mismatch: parameter has shape %r but value to "
451 "set has shape %r" %
--> 452 (p.get_value().shape, v.shape))
453 else:
454 p.set_value(v)
ValueError: mismatch: parameter has shape (12800, 4096) but value to set has shape (18432, 4096)
Thanks in advance \o
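One way to read the two shapes in this error, assuming the layer feeding the dense layer has 512 feature maps: 12800 = 512 x 5 x 5 and 18432 = 512 x 6 x 6, so the stored weights expect a 6x6 spatial map while the rebuilt network produces 5x5. A one-pixel difference like this is what a change in pooling rounding would cause (in Lasagne this is governed by the pooling layer's ignore_border argument). The sketch below only does the arithmetic; `pooled_length` is a hypothetical helper and the width 17 is an illustrative value, not taken from the model file.

```python
def pooled_length(input_length, pool_size, stride, ceil_mode):
    """Output length of an unpadded 1D pooling window.

    ceil_mode=False drops a trailing partial window; ceil_mode=True keeps it.
    """
    span = input_length - pool_size
    if span < 0:
        return 1 if ceil_mode else 0
    if ceil_mode:
        return -(-span // stride) + 1  # ceil(span / stride) + 1
    return span // stride + 1

channels = 512  # assumed number of feature maps entering the dense layer

sizes = {side: channels * side * side for side in (5, 6)}

floor_side = pooled_length(17, pool_size=3, stride=3, ceil_mode=False)
ceil_side = pooled_length(17, pool_size=3, stride=3, ceil_mode=True)
```

Under these assumptions `sizes` maps 5 to 12800 and 6 to 18432, and the two rounding modes give 5 and 6 respectively for the same input width.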
Hello, I have tried inception_v3 for fine-tuning, but the accuracy is really low, worse than googlenet and vgg19. Can you tell me the details of your pretrained model, like learning_rate, batch_size, data preprocessing, etc.?
As discussed in #16, it might be nice to make this repo (or a subset) into a package, so that things like pretrained models can be imported into users code.
Anyone have arguments against? Or feel like making a PR?
In the googlenet example, is it possible that the dropout after the average pool layer is missing? Compare to Szegedy et al., "Going deeper with convolutions", page 5, table 1.
For classification problems, I use the VGG model. I have a computer with Debian 6.0.9, 32-bit, with 15.8 GB of RAM. When I try to load a vgg19 or vgg16 model, I get a memory error:
cnn_vgg19 = build_model_vgg19()
output_layer = cnn_vgg19['prob']
MemoryError Traceback (most recent call last)
in ()
----> 1 cnn_vgg19 = build_model_vgg19()
2 output_layer = cnn_vgg19['prob']
in build_model_vgg19()
39 net['conv5_3'], 512, 3, pad=1, flip_filters=False)
40 net['pool5'] = PoolLayer(net['conv5_4'], 2)
---> 41 net['fc6'] = DenseLayer(net['pool5'], num_units=4096)
42 net['fc6_dropout'] = DropoutLayer(net['fc6'], p=0.5)
43 net['fc7'] = DenseLayer(net['fc6_dropout'], num_units=4096)
/home/roman/anaconda/lib/python2.7/site-packages/lasagne/layers/dense.pyc in __init__(self, incoming, num_units, W, b, nonlinearity, **kwargs)
69 num_inputs = int(np.prod(self.input_shape[1:]))
70
---> 71 self.W = self.add_param(W, (num_inputs, num_units), name="W")
72 if b is None:
73 self.b = None
/home/roman/anaconda/lib/python2.7/site-packages/lasagne/layers/base.pyc in add_param(self, spec, shape, name, **tags)
212 name = "%s.%s" % (self.name, name)
213 # create shared variable, or pass through given variable/expression
--> 214 param = utils.create_param(spec, shape, name)
215 # parameters should be trainable and regularizable by default
216 tags['trainable'] = tags.get('trainable', True)
/home/roman/anaconda/lib/python2.7/site-packages/lasagne/utils.pyc in create_param(spec, shape, name)
349
350 elif hasattr(spec, '__call__'):
--> 351 arr = spec(shape)
352 try:
353 arr = floatX(arr)
/home/roman/anaconda/lib/python2.7/site-packages/lasagne/init.pyc in __call__(self, shape)
29 their :meth:`sample()` method.
30 """
---> 31 return self.sample(shape)
32
33 def sample(self, shape):
/home/roman/anaconda/lib/python2.7/site-packages/lasagne/init.pyc in sample(self, shape)
175
176 std = self.gain * np.sqrt(2.0 / ((n1 + n2) * receptive_field_size))
--> 177 return self.initializer(std=std).sample(shape)
178
179
/home/roman/anaconda/lib/python2.7/site-packages/lasagne/init.pyc in sample(self, shape)
98 def sample(self, shape):
99 return floatX(get_rng().uniform(
--> 100 low=self.range[0], high=self.range[1], size=shape))
101
102
mtrand.pyx in mtrand.RandomState.uniform (numpy/random/mtrand/mtrand.c:13575)()
mtrand.pyx in mtrand.cont2_array_sc (numpy/random/mtrand/mtrand.c:2902)()
MemoryError:
I also have a second computer with 32 GB of RAM and a 64-bit OS, and on that computer it works correctly.
So 16 GB of RAM is not enough.
Is that normal?
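A rough size estimate may help here. The traceback fails while sampling the initial weights of the first dense layer (num_units=4096) with NumPy's uniform sampler, which produces float64 values before they are converted. Assuming a 224x224 input, so that pool5 outputs 512 x 7 x 7 values (an assumption, not stated in the post), the arithmetic is:

```python
def dense_weight_bytes(num_inputs, num_units, bytes_per_value):
    """Size in bytes of a dense weight matrix with the given element width."""
    return num_inputs * num_units * bytes_per_value

num_inputs = 512 * 7 * 7   # assumed pool5 output for a 224x224 input
num_units = 4096           # from the traceback: DenseLayer(..., num_units=4096)

float64_bytes = dense_weight_bytes(num_inputs, num_units, 8)
float32_bytes = dense_weight_bytes(num_inputs, num_units, 4)

print("float64 sample: %.0f MiB" % (float64_bytes / 2.0**20))
print("float32 copy:   %.0f MiB" % (float32_bytes / 2.0**20))
```

Under these assumptions a single matrix needs a contiguous block of several hundred MiB, and temporary copies come on top of that. A 32-bit process can address at most 4 GiB (often less in practice) regardless of how much RAM is installed, which would be consistent with the failure on the 32-bit machine and success on the 64-bit one.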
I have a few questions about the notebook with the implementation of "A Neural Algorithm for Artistic Style". Hopefully this is the right place for them? It'd be easier to work with and make pull requests if this was a script rather than a notebook, but it's up to you.
Overall I think this is by far the prettiest implementation I've seen of the algorithm, and it's been a pleasure to work with. My questions:
- [...] +2 to width and height?
- [...] lbfgs in scipy to be quite unstable (compared to the one in Torch used by Justin's implementation), as it often returns the error below. This seems to be quite random depending on image size/parameters, and adding new features to the algorithm isn't helping. Any ideas?
Bad direction in the line search;
refresh the lbfgs memory and restart the iteration.
* * *
N Tit Tnf Tnint Skip Nact Projg F
***** 211 259 2 0 0 1.204D-03 2.472D+03
F = 2472.1531027869614
ABNORMAL_TERMINATION_IN_LNSRCH
Line search cannot locate an adequate point after 20 function
and gradient evaluations. Previous x, f and g restored.
Possible causes: 1 error in function or gradient evaluation;
2 rounding error dominate computation.
Thanks again for the code, I've been very impressed with Lasagne because of it!
I have experimented with Lasagne's googlenet.py and I have noticed that although it produces results similar to Caffe's bvlc_googlenet they are not exactly the same. (By results I mean the output of the network after the softmax or the probabilities)
Should they produce the same results?
I am trying to get the style transfer example here to run:
https://github.com/Lasagne/Recipes/blob/master/examples/styletransfer/Art%20Style%20Transfer.ipynb
I installed CUDA on my OS X machine and got through some other cuDNN issues.
Now I am stuck at this error: "'Wrong number of dimensions: expected 4, got 3 with shape (768, 1024, 3).')"
The internet thinks it has something to do with a misshaped NumPy array.
Thanks for any advice.
/Users/stephan/anaconda/bin/python /Users/stephan/PycharmProjects/antimodular/Art_Style_Transfer.py
Using gpu device 0: GeForce GT 750M (CNMeM is disabled, CuDNN 4007)
Traceback (most recent call last):
File "/Users/stephan/PycharmProjects/antimodular/Art_Style_Transfer.py", line 150, in <module>
photo_features = {k: theano.shared(output.eval({input_im_theano: photo})) for k, output in zip(layers.keys(), outputs)}
File "/Users/stephan/PycharmProjects/antimodular/Art_Style_Transfer.py", line 150, in <dictcomp>
photo_features = {k: theano.shared(output.eval({input_im_theano: photo})) for k, output in zip(layers.keys(), outputs)}
File "/Users/stephan/.local/lib/python3.5/site-packages/theano/gof/graph.py", line 523, in eval
rval = self._fn_cache[inputs](*args)
File "/Users/stephan/.local/lib/python3.5/site-packages/theano/compile/function_module.py", line 786, in __call__
allow_downcast=s.allow_downcast)
File "/Users/stephan/.local/lib/python3.5/site-packages/theano/tensor/type.py", line 177, in filter
data.shape))
TypeError: ('Bad input argument to theano function with name "/Users/stephan/PycharmProjects/antimodular/Art_Style_Transfer.py:150" at index 0(0-based)', 'Wrong number of dimensions: expected 4, got 3 with shape (768, 1024, 3).')
Process finished with exit code 1
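The error message says the function received a 3D array of shape (768, 1024, 3), i.e. (height, width, channels), where a 4D input is expected. Assuming the network input uses the (batch, channels, height, width) layout, a minimal NumPy sketch of the conversion looks like this. `to_batch_chw` is a hypothetical helper, and any mean subtraction or channel reordering the example performs is not shown.

```python
import numpy as np

def to_batch_chw(image_hwc):
    """Convert one (height, width, channels) image to (1, channels, height, width)."""
    if image_hwc.ndim != 3:
        raise ValueError("expected a 3D (height, width, channels) array")
    chw = np.transpose(image_hwc, (2, 0, 1))  # move channels first
    return chw[np.newaxis]                    # add the leading batch axis

photo = np.zeros((768, 1024, 3), dtype=np.float32)  # placeholder image data
batch = to_batch_chw(photo)
print(batch.shape)
```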
I looked at four models in Recipes/modelzoo/: googlenet.py, vgg16.py, vgg19.py and vgg_cnn_s.py.
For vgg16.py, vgg19.py and vgg_cnn_s.py the input layer is defined as InputLayer((None, 3, 224, 224)), so the input image has size 224x224. Is it correct to load models such as vgg16 or vgg19 but change the input layer, so that the images used for classification and retraining have a different size (128x128 or other)?
The googlenet model defines its input layer without spatial dimensions: InputLayer((None, 3, None, None)).
Does this mean that the input image can be any size?
I ran the sample from Recipes/examples/ImageNet Pretrained Network (VGG_S).ipynb, and based on this example I have written similar programs for the vgg16 and vgg19 models. The vgg16 and vgg19 models have a 'mean value' key, so it is possible to calculate MEAN_IMAGE as for the vgg_s model. Googlenet has no 'mean value' parameter. How do I calculate the 'mean value' for googlenet?
I am running the Spatial Transformation example (https://github.com/Lasagne/Recipes/blob/master/examples/spatial_transformer_network.ipynb) downsample_factor =1.0 and I am getting the following error:
MemoryError: Error allocating 110231552 bytes of device memory (out of memory).
Apply node that caused the error: GpuElemwise{Composite{(i0 * (i1 + Abs(i1)))},no_inplace}(CudaNdarrayConstant{[[[[ 0.5]]]]}, GpuElemwise{Add}[(0, 0)].0)
Toposort index: 279
Inputs types: [CudaNdarrayType(float32, (True, True, True, True)), CudaNdarrayType(float32, 4D)]
Inputs shapes: [(1, 1, 1, 1), (256, 32, 58, 58)]
Inputs strides: [(0, 0, 0, 0), (107648, 3364, 58, 1)]
Inputs values: [<CudaNdarray object at 0x7fef7314c7f0>, 'not shown']
Outputs clients: [[GpuContiguous(GpuElemwise{Composite{(i0 * (i1 + Abs(i1)))},no_inplace}.0)]]
Can anyone replicate the same issue?
Hi, I have a vgg16 model trained on 2 classes, for which I want to visualize the saliency map using guided backpropagation. Could you please let me know how I can convert my caffemodel to the pkl model that you used in the corresponding notebook for guided backpropagation?
Or at least how I can convert a .py Lasagne model to a .pkl model?
Best
Hi,
not sure if it's an issue.
I am training the variational autoencoder with a different set of images with 3 color channels.
I am getting an increasingly negative loss. I wonder: is this a normal or valid outcome or is it a bug?
Shouldn't the loss value be always a positive amount? I am worried because the search for
a minimum of the loss function might not get to anything if the loss is not bounded by 0.
Building model and compiling functions...
L = 2, z_dim = 1, n_hid = 3, binary=True
Starting training...
Epoch 1 of 300 took 36.576s
training loss: 1193603.765134
validation loss: 358401.526396
Epoch 2 of 300 took 34.345s
training loss: 170094.748865
validation loss: -990985.720292
Epoch 3 of 300 took 34.682s
training loss: -948598.243076
validation loss: -2374793.240720
Epoch 4 of 300 took 33.571s
training loss: -2179357.580108
validation loss: -3822347.805930
Epoch 5 of 300 took 36.031s
training loss: -3293897.853456
validation loss: -5299324.057571
I tried with 3 or 1024 hidden units, and z dimension being either 1 or 2, but the result is the same.
Using the regular MNIST I have no issues: the loss value is positive and decreasing toward 0.
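One possible explanation, not verified against the script: the log shows binary=True, which suggests a binary cross-entropy reconstruction term. That term is only guaranteed to be non-negative when the targets lie in [0, 1]. If the color images are fed in with raw 0-255 values (an assumption about the data), the term can become arbitrarily negative. A small pure-Python check of a per-pixel BCE:

```python
import math

def binary_cross_entropy(prediction, target):
    """Per-pixel BCE; only guaranteed non-negative when 0 <= target <= 1."""
    return -(target * math.log(prediction)
             + (1.0 - target) * math.log(1.0 - prediction))

scaled = binary_cross_entropy(0.9, 1.0)       # target in [0, 1]
unscaled = binary_cross_entropy(0.99, 255.0)  # raw 8-bit pixel value as target
```

With a target of 1.0 the value is positive, while with a target of 255.0 it is strongly negative and decreases further as the prediction approaches 1, so scaling the inputs to [0, 1] would be the first thing to check.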
Hi,
It would be nice to have this as part of the Recipes repo:
https://gist.github.com/ebenolson/1682625dc9823e27d771
BTW, in order for that code to work now, get_output_shape must be replaced with output_shape
Cheers,
Chris
It seems that I cannot run the CIFAR-10 example properly on the GPU (it runs on the CPU anyway), and that the problem is somehow with Lasagne, as I do not have a similar problem with Keras.
Here is my log of the example on gpu:
$ THEANO_FLAGS=device=gpu python Deep_Residual_Learning_CIFAR-10.py
Using gpu device 0: GeForce GTX TITAN X (CNMeM is disabled, cuDNN 5005)
Loading data...
Building model and compiling functions...
number of parameters in model: 464154
Starting training...
8.12522697449
7.34526205063
7.88604187965
7.31621003151
7.64691305161
The numbers at the end indicate the time in seconds needed to process a single batch. In particular, I time the line train_err += train_fn(inputs, targets).
This is the log without gpu (default in my case):
$ python Deep_Residual_Learning_CIFAR-10.py
Loading data...
Building model and compiling functions...
number of parameters in model: 464154
Starting training...
7.94606494904
7.46665215492
7.50057983398
7.23336601257
7.53180503845
What I observe is that the GPU is mostly idle during training; nvidia-smi constantly gives something like this:
+------------------------------------------------------+
| NVIDIA-SMI 352.93 Driver Version: 352.93 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX TIT... Off | 0000:01:00.0 On | N/A |
| 26% 65C P5 40W / 250W | 1971MiB / 12287MiB | 15% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 922 G ...ves-passed-by-fd --v8-snapshot-passed-by- 292MiB |
| 0 979 G /usr/bin/X 1015MiB |
| 0 2420 G compiz 506MiB |
| 0 22479 C python 127MiB |
+-----------------------------------------------------------------------------+
However, all CPU cores are busy with computations, even when I intend to run on the GPU.
My OS is Ubuntu 15. I have installed bleeding-edge Theano.
I can run things with Theano + Keras on the GPU with THEANO_FLAGS=device=gpu python ...: I can see that the GPU is loaded and observe a speedup.
Please let me know if there indeed is some sort of bug or if this is normal. Any help is greatly appreciated!
Hi,
I have been working on a project recently that estimates depth map from a pair of stereo images using a fully convolutional network (https://github.com/LouisFoucard/StereoConvNet). Would you be interested in adding that as an example? It is fairly short, and it uses batch normalization so that additionally could be an example that shows implementation of batch normalization.
The data (stereo images and depth maps) used to trained the network is obtained by generating random 3d scenes with Blender, and is also available on github (https://github.com/LouisFoucard/DepthMap_dataset).
Please let me know if that is of interest to you, and I can clean it up a bit, add some more comments.
All networks in the zoo use DNN layers. Is there a way to use a pretrained model if I don't have a GPU installed?
We have a wiki page on the main library repository with a bunch of links to Lasagne extensions, as well as code that makes use of the library: https://github.com/Lasagne/Lasagne/wiki/3rd-party-extensions-and-code
The page isn't very up to date and I think a lot of people don't know it exists. Perhaps we should start putting these links into the README of this repository instead. People looking for Lasagne-based code / extensions will be much more likely to find it that way (it'll be on the front page when they visit this repo).
Alternatively we could have a separate text file (or text files) with external links + descriptions in this repo somewhere.
Thoughts?
Do you want to add this to the examples folder?
Hello all
Hopefully this is the appropriate place to post this.
I've been working on this off and on since the Preactivation ResNet paper was published (https://arxiv.org/abs/1603.05027) and I finally think I flushed out the bugs in my code. Would this be something worth adding to the Lasagne Recipes? It would be easy to copy and paste my models into the existing ResNet example so that everything is kept consistent.
My code:
https://github.com/FlorianMuellerklein/Identity-Mapping-ResNet-Lasagne
Hi all,
I tried the great example of ImageNet Pretrained Network (VGG_S).ipynb.
https://github.com/Lasagne/Recipes/blob/master/examples/ImageNet%20Pretrained%20Network%20(VGG_S).ipynb
With GPU I get the same result in the tutorial.
But with a CPU I am getting different results.
For example instead of "German Shepherd" the net "thinks" its "Border terrier"
Why is that?
Can I fix it?
Best
Eitan
Hi,
this is line 190 of variational_autoencoder.py:
- 0.5 * T.sqr(tgt - mu) / T.exp(2 * ls))
Where does that 2 coefficient for the log sigma come from? I did the derivations myself and I could not find it. This other implementation: https://github.com/y0ast/Variational-Autoencoder/blob/master/VAE.py does not include that multiplier. Any explanation? Is it a bug?
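A possible reading, assuming `ls` denotes log sigma (not log sigma squared): the Gaussian log-density divides by sigma^2, and sigma^2 = exp(2 * log sigma), which would account for the factor 2. An implementation that parameterizes the log-variance instead would use exp(log_var) with no extra factor. Which convention each implementation uses is an assumption here; the sketch below only checks that the two parameterizations agree numerically.

```python
import math

def gaussian_log_pdf_log_sigma(x, mu, log_sigma):
    """log N(x | mu, sigma^2), parameterized by log_sigma = log(sigma)."""
    return (-0.5 * math.log(2 * math.pi) - log_sigma
            - 0.5 * (x - mu) ** 2 / math.exp(2 * log_sigma))

def gaussian_log_pdf_log_var(x, mu, log_var):
    """The same density, parameterized by log_var = log(sigma^2)."""
    return (-0.5 * math.log(2 * math.pi) - 0.5 * log_var
            - 0.5 * (x - mu) ** 2 / math.exp(log_var))

sigma = 0.7
a = gaussian_log_pdf_log_sigma(1.3, 0.2, math.log(sigma))
b = gaussian_log_pdf_log_var(1.3, 0.2, math.log(sigma ** 2))
```

Both calls evaluate the same density, so `a` and `b` agree up to floating-point error.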
Hi,
seems like I gave you the wrong link when I asked you to update the weights for the UNet example on AWS. I am very sorry for that!
https://www.dropbox.com/s/h4fhpzeqzgxl4qw/UNet_params_pretrained.zip?dl=0
Those are the correct weights. I would be very happy if you could upload them!
Cheers,
Fabian
I am using Ubuntu 14.04 with a titan x GPU and cudnn v4.
When I run the Deep_Residual_Learning_CIFAR-10.py script
The results are as follows
Using gpu device 3: GeForce GTX TITAN X (CNMeM is disabled, cuDNN 4007)
Loading data...
Building model and compiling functions...
number of parameters in model: 464154
Starting training...
Epoch 1 of 82 took 194.576s
training loss: 1.885696
validation loss: 1.185959
validation accuracy: 58.67 %
Epoch 2 of 82 took 192.870s
training loss: 1.241701
validation loss: 0.860152
validation accuracy: 70.31 %
Epoch 3 of 82 took 196.294s
training loss: 0.969671
validation loss: 0.767420
validation accuracy: 75.22 %
Epoch 4 of 82 took 193.210s
training loss: 0.831287
validation loss: 0.929070
validation accuracy: 72.58 %
Epoch 5 of 82 took 194.127s
training loss: 0.757095
validation loss: 0.628573
validation accuracy: 79.36 %
Epoch 6 of 82 took 194.738s
training loss: 0.710511
validation loss: 0.578690
validation accuracy: 80.73 %
Is this running speed normal?
I'm trying to use the VGG-16 net with pretrained weights.
The link https://s3.amazonaws.com/lasagne/recipes/pretrained/imagenet/vgg16.pkl does not seem to be public?
@ebenolson : I can download the file if I log in with the information you gave me.
I'm not sure how I should pre-process my data to make the model work. I looked at the preprocessing description in the repo:
In the paper, the model is denoted as the configuration D trained with scale jittering. The input images should be zero-centered by mean pixel (rather than mean image) subtraction. Namely, the following BGR values should be subtracted: [103.939, 116.779, 123.68].
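I guess I should do something like (not tested):
import numpy as np

# Per-channel mean in BGR order, shaped (3, 1, 1) so it broadcasts over (channels, height, width)
MEAN_VALUE = np.array([103.939, 116.779, 123.68]).reshape(3, 1, 1)

def preprocess(img):
    # img is (channels, height, width) in RGB order, values are 0-255
    img = img[::-1].astype(np.float32)  # switch to BGR; cast so the subtraction also works for uint8 input
    img -= MEAN_VALUE
    return img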
Maybe we should add preprocess functions to the modelzoo?
Hi, I am trying to train an autoencoder using a spatial transformer network (sptn). I initially thought it could be used in isolation, but in that case backpropagation does not seem to work, so I attached a final linear layer. Even so, the sptn does not seem to bring anything compared to a single dense layer. Any ideas?
import numpy as np
import theano
import lasagne

# Assumed aliases: the original post does not show how conv and pool were defined.
conv = lasagne.layers.Conv2DLayer
pool = lasagne.layers.MaxPool2DLayer

# input_width and input_height are defined elsewhere in the poster's script.
ini = lasagne.init.HeUniform()
l_in = lasagne.layers.InputLayer(shape=(None, 1, input_width, input_height))
# localization part: the bias is initialized to the identity affine transform
b = np.zeros((2, 3), dtype=theano.config.floatX)
b[0, 0] = 1
b[1, 1] = 1
b = b.flatten()
loc_l1 = pool(l_in, pool_size=(2, 2))
loc_l2 = conv(loc_l1, num_filters=20, filter_size=(5, 5), W=ini)
loc_l3 = pool(loc_l2, pool_size=(2, 2))
loc_l4 = conv(loc_l3, num_filters=20, filter_size=(7, 7), W=ini)
loc_l5 = lasagne.layers.DenseLayer(loc_l4, num_units=50, W=lasagne.init.HeUniform('relu'))
loc_out = lasagne.layers.DenseLayer(loc_l5, num_units=6, b=b, W=lasagne.init.Constant(0.0),
                                    nonlinearity=lasagne.nonlinearities.identity)
# transformer
l_trans1 = lasagne.layers.TransformerLayer(l_in, loc_out, downsample_factor=1)
l_enc2 = lasagne.layers.DenseLayer(l_trans1, num_units=input_width * input_height,
                                   nonlinearity=lasagne.nonlinearities.linear, name='final')
l_out = lasagne.layers.ReshapeLayer(l_enc2, shape=(-1, 1, input_width, input_height))
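With W set to zero and the bias set to the flattened matrix [[1, 0, 0], [0, 1, 0]], the localization network above initially outputs identity affine parameters, so the transformer should start out passing the input through unchanged. The NumPy sketch below illustrates that starting point. `affine_grid` is a hypothetical helper written for this illustration, not Lasagne's implementation, and it assumes target coordinates normalized to [-1, 1].

```python
import numpy as np

# Same bias initialization as in the post: theta = [[1, 0, 0], [0, 1, 0]]
theta = np.zeros((2, 3), dtype=np.float32)
theta[0, 0] = 1
theta[1, 1] = 1

def affine_grid(theta, height, width):
    """Map a regular grid of normalized target coordinates through theta.

    Returns the source coordinates to sample from and the target coordinates,
    both with shape (2, height * width).
    """
    xs, ys = np.meshgrid(np.linspace(-1, 1, width), np.linspace(-1, 1, height))
    # Homogeneous coordinates (x, y, 1) for every grid point
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    return theta.dot(grid), grid[:2]

source, target = affine_grid(theta, 4, 5)
# With the identity parameters, every target point samples from itself.
print(np.allclose(source, target))
```

Starting from the identity means any benefit has to come from the localization network learning to move away from it during training.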
Hi,
I'm delighted to find a residual network example on CIFAR-10:
https://github.com/Lasagne/Recipes/blob/master/papers/deep_residual_learning/Deep_Residual_Learning_CIFAR-10.py
Are there any residual network examples on ImageNet?
Thanks!
@gyglim I am trying to run the Video features with C3D.ipynb example, but the links are either broken or access is forbidden. Would it be possible to upload the files to some other server? Thanks!