titu1994 / densenet
DenseNet implementation in Keras
License: MIT License
Hi, @titu1994 do you have the pretrained weights of DenseNet-BC-190-40 on CIFAR-10?
What's the meaning of "no-top"?
thx!
I ran Densenet.py on the MNIST dataset, but training hung during the first epoch while running on the CPU. Please suggest a way to run the Keras DenseNet on MNIST, and advise how to run it on a GPU.
Memory-Efficient Implementation of DenseNets
https://arxiv.org/abs/1707.06990
Apparently the memory efficient densenet comes with a paper, it says pre-activation is the way to go. I wonder if TensorFlow has a way to do this memory-efficient version without implementing new ops.
I was wondering if you had a sense why you need to do a reshape here before applying the softmax.
Also, looking at several implementations and network diagrams, such as this:
It appears that __denseblock needs to be modified a bit, because the input is only connected to the output through conv blocks.
x_list = [x]
should be:
x_list = []
which should mean the relevant instances of [1:]
in the file can go away. I also think this means concat_list does not need to be returned.
The same should apply to keras-contrib.
Sorry for being a Python noob, but how does this implementation work? I tested it and see that the code works as is to construct the model, but it builds the model with a call to the DenseNet() function. My question is: how does it do this even though the DenseNet function never seems to call the other DenseNet building-block functions (the ones prefixed with "__")? This is very confusing to me but amazing at the same time. PLEASE HELP!
I am training cifar10 dataset using cifar10.py script. I am able to start the training but the training doesn't go beyond 1 epoch. I am not getting any error messages. I waited for close to one hour for the training to move forward but nothing happens.
Is it because of the different versions of Keras/Theano? I am currently using keras version 2.0.2 and theano 0.9. And I am using TitanX gpu.
In the DenseNet FCN paper, the diagram in Figure 1 shows no skip connection from the input to the output of the last dense block in the decompression path.
But the function '__create_fcn_dense_net' uses a skip connection, like this:
line 770 of densenet.py
x = Conv2D(nb_classes, (1, 1), activation='linear', padding='same', use_bias=False)(x_up)
Do I misunderstand it? Or is this an improvement that I missed?
How do I train my own dataset?
Last 10 epochs:
Epoch 190/200
2113s - loss: 1.0719 - acc: 0.8601 - val_loss: 2.2472 - val_acc: 0.6207
Epoch 191/200
2113s - loss: 1.0691 - acc: 0.8607 - val_loss: 2.1733 - val_acc: 0.6445
Epoch 192/200
2114s - loss: 1.0706 - acc: 0.8597 - val_loss: 2.1769 - val_acc: 0.6439
Epoch 193/200
2113s - loss: 1.0750 - acc: 0.8585 - val_loss: 2.2456 - val_acc: 0.6286
Epoch 194/200
2113s - loss: 1.0639 - acc: 0.8622 - val_loss: 2.2660 - val_acc: 0.6455
Epoch 195/200
2113s - loss: 1.0679 - acc: 0.8607 - val_loss: 2.1948 - val_acc: 0.6376
Epoch 196/200
2114s - loss: 1.0676 - acc: 0.8609 - val_loss: 2.1855 - val_acc: 0.6522
Epoch 197/200
2113s - loss: 1.0652 - acc: 0.8618 - val_loss: 2.4428 - val_acc: 0.6053
Epoch 198/200
2114s - loss: 1.0675 - acc: 0.8603 - val_loss: 2.2936 - val_acc: 0.6236
Epoch 199/200
2114s - loss: 1.0685 - acc: 0.8589 - val_loss: 2.1497 - val_acc: 0.6450
Epoch 200/200
2113s - loss: 1.0635 - acc: 0.8626 - val_loss: 2.2698 - val_acc: 0.6251
Accuracy : 62.51
Error : 37.49
Hello, I have a problem with the function '__dense_block'. I think this function does not describe the structure as follows:
The code is:
def __dense_block(x, nb_layers, nb_filter, growth_rate, bottleneck=False, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True, return_concat_list=False):
    concat_axis = 1 if K.image_data_format() == 'channels_first' else -1
    x_list = [x]
    for i in range(nb_layers):
        cb = __conv_block(x, growth_rate, bottleneck, dropout_rate, weight_decay)
        x_list.append(cb)
        x = concatenate([x, cb], axis=concat_axis)
        if grow_nb_filters:
            nb_filter += growth_rate
    if return_concat_list:
        return x, nb_filter, x_list
    else:
        return x, nb_filter
I think this function did not implement the part of 'the output of the block is the concatenation of the outputs of the 4 layers, and thus contains 4 ∗ k feature maps'. Could you help me to figure it out?
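The channel arithmetic behind this question can be checked without Keras. The sketch below tracks only channel counts through the same loop; the helper name dense_block_channels and the numbers (48 input channels, 4 layers, growth rate 16) are illustrative assumptions, not taken from the repository.

```python
def dense_block_channels(in_channels, nb_layers, growth_rate):
    """Track channel counts through a dense block, mirroring the loop in __dense_block.

    Returns (channels of x after the loop, channels of the new feature maps only).
    """
    x_channels = in_channels      # x starts as the block input
    new_feature_channels = 0      # what x_list[1:] would hold if concatenated

    for _ in range(nb_layers):
        cb_channels = growth_rate            # each conv block emits growth_rate maps
        new_feature_channels += cb_channels  # x_list.append(cb)
        x_channels += cb_channels            # x = concatenate([x, cb])

    return x_channels, new_feature_channels


full, new_only = dense_block_channels(in_channels=48, nb_layers=4, growth_rate=16)
print(full)      # 112 = 48 + 4 * 16 (block input included)
print(new_only)  # 64  = 4 * 16 (outputs of the 4 layers only)
```

Under this reading, the returned x also carries the block input, while the "4 * k feature maps" described in the quote would correspond to concatenating only x_list[1:].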
I created a DenseNet model following the instructions in the README.md. I am using Keras with TF for the back-end to train a model (from scratch) to classify 240x240 images into one of 13 classes.
Attached is a simplified script that shows how I set up the model, generators, and optimizer. The code fails when I call the train_generator's fit_generator method. Attached is the output log.
Here are my package versions:
densenet_simplified.txt
densenet_output.txt
Any help in how I might change the configuration to use less memory (if that's what I am doing wrong) would be greatly appreciated.
Thanks,
Michael.
It seems there may be another change needed for the bottleneck case based on the paper:
We find this design especially effective for DenseNet and we refer to our network with such a bottleneck layer, i.e., to the BN-ReLU-Conv(1×1)-BN-ReLU-Conv(3×3) version of Hl, as DenseNet-B.
It looks like the network here and in keras-contrib doesn't follow that order. I think it should be:
def __conv_block(x, nb_filter, bottleneck=False, dropout_rate=None, weight_decay=1e-4):
    '''
    Adds a convolution layer (with batch normalization and relu),
    and optionally a bottleneck layer.
    # Arguments
        x: Input tensor
        nb_filter: integer, the dimensionality of the output space
            (i.e. the number output of filters in the convolution)
        bottleneck: if True, adds a bottleneck convolution block
        dropout_rate: dropout rate
        weight_decay: weight decay factor
    # Input shape
        4D tensor with shape:
        `(samples, channels, rows, cols)` if data_format='channels_first'
        or 4D tensor with shape:
        `(samples, rows, cols, channels)` if data_format='channels_last'.
    # Output shape
        4D tensor with shape:
        `(samples, filters, new_rows, new_cols)` if data_format='channels_first'
        or 4D tensor with shape:
        `(samples, new_rows, new_cols, filters)` if data_format='channels_last'.
        `rows` and `cols` values might have changed due to stride.
    # Returns
        output tensor of block
    '''
    with K.name_scope('ConvBlock'):
        concat_axis = 1 if K.image_data_format() == 'channels_first' else -1
        if bottleneck:
            inter_channel = nb_filter * 4
            x = BatchNormalization(axis=concat_axis, epsilon=1.1e-5)(x)
            x = Activation('relu')(x)
            x = Conv2D(inter_channel, (1, 1), kernel_initializer='he_normal', padding='same', use_bias=False,
                       kernel_regularizer=l2(weight_decay))(x)
        x = BatchNormalization(axis=concat_axis, epsilon=1.1e-5)(x)
        x = Activation('relu')(x)
        x = Conv2D(nb_filter, (3, 3), kernel_initializer='he_normal', padding='same', use_bias=False)(x)
        if dropout_rate:
            x = Dropout(dropout_rate)(x)
    return x
What was the reason behind restricting sigmoid activation to the case where there is only one class? Wouldn't you need to use a sigmoid to do multiclass-multilabel classification?
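To illustrate the distinction this question relies on, here is a small pure-Python sketch (no Keras) comparing softmax and sigmoid on the same logits. The logit values are made up for illustration, and the two helper functions are local definitions, not part of the repository.

```python
import math


def softmax(logits):
    """Normalise logits into a distribution that sums to 1 (single-label)."""
    m = max(logits)                              # subtract the max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Score each class independently in (0, 1) (multi-label)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in logits]


# Hypothetical logits where two labels are both strongly present.
logits = [2.0, 2.0, -3.0]

print(softmax(logits))   # the two strong classes split the mass, each below 0.5
print(sigmoid(logits))   # the two strong classes are each well above 0.5
```

With softmax the classes compete for a total probability of 1, so two labels cannot both exceed 0.5; with sigmoid each label is thresholded on its own, which is the behaviour a multi-label setup needs.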
Hello! I am facing a problem with the pre-trained model you are sharing. Whenever I try to load the weights (either for transfer learning or prediction) I get the following error message:
Exception: Layer #0 (named "initial_conv2D" in the current model) was found to correspond to layer convolution2d_1 in the save file. However the new layer initial_conv2D expects 1 weights, but the saved weights have 2 elements.
I can't tell whether I am doing something wrong or whether the problem is in the pretrained model...
Can you help?
Thank you in advance and thank you for your contributions!
I have an 8gb GTX 1080 and I'm running cifar10.py on TensorFlow, but it seems to run out of memory very easily. Is this to be expected? Once I shut down the gui to free up every last ounce of gpu memory and reduced the batch size to 32 it did start running and now I'm up to epoch 7 at about 4 minutes per epoch which seems a bit faster than what you have.
However, this is just with cifar10, so will it even be possible to load and train the imagenet version of DenseNet-40-12 without a smaller network choice?
The environment I ran in was:
tensorflow-gpu 1.4.0
keras 2.0.9
The problem occurred while I ran python cifar10.py:
Traceback (most recent call last):
File "densenet.py", line 28, in <module>
from subpixel import SubPixelUpscaling
File "/home1/zhucheng/contest/ChallengeAI/ai_challenger_scene_train_20170904/DenseNet/subpixel.py", line 11, in <module>
import tensorflow_backend as K_BACKEND
File "/home1/zhucheng/contest/ChallengeAI/ai_challenger_scene_train_20170904/DenseNet/tensorflow_backend.py", line 6, in <module>
from keras.backend.tensorflow_backend import _postprocess_conv2d_output
ImportError: cannot import name _postprocess_conv2d_output
I am sure that I installed the environment correctly. Any advice on this? I am new to Keras and TF.
I've just cloned the repo and executed cifar10_fast:
$ python cifar10_fast.py
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
Traceback (most recent call last):
File "cifar10_fast.py", line 29, in <module>
dropout_rate=dropout_rate)
File "/home/moose/GitHub/DenseNet/densenet_fast.py", line 129, in create_dense_net
weight_decay=weight_decay)
File "/home/moose/GitHub/DenseNet/densenet_fast.py", line 86, in dense_block
x = merge(feature_list, mode='concat', concat_axis=concat_axis)
File "/home/moose/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 1680, in merge
name=name)
File "/home/moose/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 1301, in __init__
self.add_inbound_node(layers, node_indices, tensor_indices)
File "/home/moose/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 635, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "/home/moose/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 172, in create_node
output_tensors = to_list(outbound_layer.call(input_tensors, mask=input_masks))
File "/home/moose/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 1394, in call
return K.concatenate(inputs, axis=self.concat_axis)
File "/home/moose/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 1427, in concatenate
return tf.concat(axis, [to_dense(x) for x in tensors])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1047, in concat
dtype=dtypes.int32).get_shape(
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 651, in convert_to_tensor
as_ref=False)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 716, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/constant_op.py", line 165, in constant
tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
_AssertCompatible(values, dtype)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
(dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.
In a DenseNet with 12 layers, the second layer has 11 connections. I cannot find it. Please explain. Thanks.
Dear
I used the following settings for my network, but I cannot get 76% accuracy on the validation set. The optimizer is 'Adam', and I also applied mean subtraction as preprocessing for the R, G, and B channels separately.
Would you please help me?
model = Sequential()
model.add(Conv2D(32, (3, 3), border_mode='same',batch_input_shape=(None,32,32,3)))
model.add(Activation('tanh'))
model.add(Conv2D(32, ( 3, 3),border_mode='same'))
model.add(Activation('tanh'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3),border_mode='same'))
model.add(Activation('tanh'))
model.add(Conv2D(64, ( 3, 3),border_mode='same'))
model.add(Activation('tanh'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense( 1024,init='normal'))
model.add(Activation('tanh'))
model.add(Dropout(0.25))
model.add(Dense( 10,))
model.add(Activation('softmax'))
Traceback (most recent call last):
File "D:/GitHub/DenseNet-master/imagenet_inference.py", line 26, in <module>
print('Predicted:', decode_predictions(preds))
File "C:\softwares\Anaconda3\envs\Keras\lib\site-packages\keras_applications\imagenet_utils.py", line 224, in decode_predictions
fpath = keras_utils.get_file(
AttributeError: 'NoneType' object has no attribute 'get_file'
Keras 2.2.4. What is causing this bug, and how should I fix it? Thank you!
Line 514 in c197321
There is no Dropout and AvgPooling is being used instead of MaxPooling
Traceback (most recent call last):
File "cifar10.py", line 31, in <module>
growth_rate=growth_rate, nb_filter=nb_filter, dropout_rate=dropout_rate)
File "DenseNet/densenet.py", line 98, in DenseNet
include_top=include_top)
TypeError: _obtain_input_shape() got an unexpected keyword argument 'dim_ordering'
Could you offer the DenseNet_cifar100 weights?
I can't find this file on Google.
Hi,
first of all, thanks for your work providing this Keras implementation of DenseNets.
I am trying to create a DenseNet model without the last layer using the instruction:
model = densenet.DenseNet((32, 32,3), depth=40, growth_rate=12, nb_filter=16,include_top=False)
but it produces the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "densenet.py", line 169, in DenseNet
model.load_weights(weights_path)
File "/home/joheras/.virtualenvs/keras/local/lib/python2.7/site-packages/keras/engine/topology.py", line 2572, in load_weights
load_weights_from_hdf5_group(f, self.layers)
File "/home/joheras/.virtualenvs/keras/local/lib/python2.7/site-packages/keras/engine/topology.py", line 3012, in load_weights_from_hdf5_group
g = f[name]
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-build-YjgiH2/h5py/h5py/_objects.c:2687)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-build-YjgiH2/h5py/h5py/_objects.c:2645)
File "/home/joheras/.virtualenvs/keras/local/lib/python2.7/site-packages/h5py/_hl/group.py", line 166, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-build-YjgiH2/h5py/h5py/_objects.c:2687)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-build-YjgiH2/h5py/h5py/_objects.c:2645)
File "h5py/h5o.pyx", line 190, in h5py.h5o.open (/tmp/pip-build-YjgiH2/h5py/h5py/h5o.c:3440)
KeyError: "Unable to open object (Object 'dense_2' doesn't exist)"
It does not matter whether I use theano or tensorflow as backend.
Do you know how to solve this?
Thanks in advance.
Jónathan
I downloaded the DenseNet-40-12-Tensorflow-Backend-TF-dim-ordering.h5 file from the Releases page, compiled the model, loaded the weights, and tested it, but only got a 10.06% accuracy. Am I using the pretrained models wrong?
My code:
model = densenet.DenseNet(img_dim, classes=nb_classes, depth=depth, nb_dense_block=nb_dense_block,
growth_rate=growth_rate, nb_filter=nb_filter, dropout_rate=dropout_rate, weights=None)
optimizer = Adam(lr=1e-3) # Using Adam instead of SGD to speed up training
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=["accuracy"])
print("Finished compiling")
print("Building model...")
(trainX, trainY), (testX, testY) = cifar10.load_data()
testX = testX.astype('float32')
testX = densenet.preprocess_input(testX)
Y_test = np_utils.to_categorical(testY, nb_classes)
# Load model
weights_file="weights/DenseNet-40-12-Tensorflow-Backend-TF-dim-ordering.h5"
if os.path.exists(weights_file):
    model.load_weights(weights_file, by_name=True)
    print("Model loaded.")
yPreds = model.predict(testX)
yPred = np.argmax(yPreds, axis=1)
yTrue = testY
accuracy = metrics.accuracy_score(yTrue, yPred) * 100
error = 100 - accuracy
print("Accuracy : ", accuracy)
print("Error : ", error)
I have a dataset organised into folders like this:
`-Training_Set
--Class1
--img1.jpg
--img2.jpg
..
--Class2
--img101.jpg
--img102.jpg
..
--Class3
--img201.jpg
--img202.jpg
-Test_Set
--Class1
--img10.jpg
--img11.jpg
..
--Class2
--img150.jpg
--img140.jpg
..
--Class3
--img210.jpg
--img220.jpg`
and I want to load this dataset instead of the CIFAR-10 dataset in this line of code, using Keras (Python):
(trainX, trainY), (testX, testY) = cifar10.load_data()
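A folder-per-class layout like the one above maps naturally to integer labels. The sketch below uses only the standard library to list image paths with labels; the helper name list_labeled_images and the extension list are assumptions for illustration, and decoding the images into arrays is left out.

```python
import os
import tempfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def list_labeled_images(root):
    """Return (paths, labels, class_names) for a root/<class>/<image> layout."""
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    paths, labels = [], []
    for index, name in enumerate(class_names):
        class_dir = os.path.join(root, name)
        for filename in sorted(os.listdir(class_dir)):
            if filename.lower().endswith(IMAGE_EXTENSIONS):
                paths.append(os.path.join(class_dir, filename))
                labels.append(index)   # label = position of the class folder
    return paths, labels, class_names


# Demo on a throwaway tree with empty placeholder files.
with tempfile.TemporaryDirectory() as root:
    layout = {"Class1": ["img1.jpg", "img2.jpg"], "Class2": ["img101.jpg"]}
    for name, files in layout.items():
        os.makedirs(os.path.join(root, name))
        for filename in files:
            open(os.path.join(root, name, filename), "w").close()

    paths, labels, class_names = list_labeled_images(root)
    print(class_names)  # ['Class1', 'Class2']
    print(labels)       # [0, 0, 1]
```

Keras also provides ImageDataGenerator.flow_from_directory, which reads this kind of layout directly and yields batches, so it may be a simpler replacement for the cifar10.load_data() call than building arrays by hand.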
What does L(L+1)/2 mean?
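As a rough illustration: in a densely connected stack of L layers, layer i receives the original input plus the outputs of the i - 1 layers before it, so it has i incoming connections. Summing over all layers gives 1 + 2 + ... + L = L(L+1)/2. The function name below is hypothetical and the snippet only counts connections; it builds no network.

```python
def count_connections(num_layers):
    """Count direct connections when layer i receives i inputs (i = 1..L)."""
    total = 0
    for layer in range(1, num_layers + 1):
        total += layer   # the block input plus the (layer - 1) earlier layers
    return total


for L in (1, 4, 12):
    # The running sum matches the closed form L(L+1)/2.
    assert count_connections(L) == L * (L + 1) // 2
    print(L, count_connections(L))
```

For comparison, a plain chain of L layers has only L connections, one per layer.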
This is a very useful project for me. I spent some time modifying a few details to make sure it can run on Keras 2, such as adding "require_flatten=include_top". However, it would be better if you could provide a Keras 2 version.
Keras has created a new official keras-contrib repository where they will accept a broader range of contributions than mainline keras, for eventual inclusion in mainline if it becomes widely used. Would you consider submitting your implementation?
Thank you for this implementation.
I want to know how to use this for transfer learning.
Would you please explain it step by step?
I just ran a 300 epoch run using tensorflow and an unmodified cifar10.py
from 54ed2d6 on a Titan X (old version) and got the following results:
Epoch 300/300
499/500 [============================>.] - ETA: 0s - loss: 0.0635 - acc: 0.9891Epoch 00299: val_acc did not improve
500/500 [==============================] - 181s - loss: 0.0635 - acc: 0.9891 - val_loss: 0.3646 - val_acc: 0.9224
Accuracy : 92.24
Error : 7.76
Here is the file:
DenseNet-40-12-CIFAR10.h5.zip
This definitely doesn't seem as good as the previous training runs from the readme, which cite 4.51% error.
Looking at the preprocess function I see:
x *= 0.017 # scale values
I thought it might be 1/128 but that is 0.0078125. I can't find anywhere else that something similar is done either in upstream implementations or in the keras preprocessing. Do you have information or reasoning regarding this scaling factor?
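One possible reading, offered as an assumption rather than the author's confirmed reasoning, is that 0.017 approximates 1/std for the commonly used ImageNet per-channel standard deviations expressed on a 0-255 pixel scale. The arithmetic can be checked as follows:

```python
# The factor from the preprocess function quoted above.
scale = 0.017

# Commonly used ImageNet per-channel standard deviations on a 0-1 scale
# (assumption: these are the statistics the factor approximates).
imagenet_std = (0.229, 0.224, 0.225)

implied_std = 1.0 / scale                     # std on a 0-255 scale implied by the factor
std_255 = [s * 255.0 for s in imagenet_std]   # the same statistics on a 0-255 scale

print(round(implied_std, 2))                  # 58.82
print([round(s, 1) for s in std_255])
```

The implied value is within about two units of each per-channel figure, which is consistent with a single rounded scale factor, but it does not prove where the constant came from.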
If I use the model you provided (DenseNet-40-12-Cifar10), the classification accuracy is 94.74%.
But when I try to train a new model by just uncommenting the "model.fit_generator(...)" line in Cifar10.py, the accuracy after 200 epochs is 88.3% (as shown on screen).
Is something wrong, or how should I train a new model from the original CIFAR-10 dataset?
Thanks!
The depth of densenet-264 should be 201 not 264
line 425, I guess without loading weights for it there were no errors :P
def __dense_block(x, nb_layers, nb_filter, growth_rate, bottleneck=False, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True, return_concat_list=False):
    concat_axis = -1
    x_list = [x]
    for i in range(nb_layers):
        cb = __conv_block(x, growth_rate, bottleneck, dropout_rate, weight_decay)
        x_list.append(cb)
        x = concatenate([x, cb], axis=concat_axis)
        if grow_nb_filters:
            nb_filter += growth_rate
I want to know whether it is an error to pass growth_rate to __conv_block. Why not nb_filters?
Get ImportError: cannot import name _obtain_input_shape
with keras 2.2.2 as opposed to 2.2.0
I tried "model = DenseNetImageNet121((224,224,3),classes=5, weights='imagenet' ,include_top=False)" and I have downloaded 'DenseNet-BC-121-32-no-top.h5', so 'model.load_weights( 'DenseNet-BC-121-32-no-top.h5')' is executed.
But the error below was raised:
KeyError: "Unable to open object (Object 'global_average_pooling2d_1' doesn't exist)"
Anaconda3, Python 3.6, on Windows 7.
Dear @titu1994
I want to train a baseline model on CIFAR-100, dense-40-12, but I cannot obtain performance like the paper's. Could you share your hyperparameters or a pretrained model for CIFAR-100?
Thanks a million.
How is L(L+1)/2 implemented?
I ran the training script myself and after 400 epochs, via two runs of 200 epochs where I reloaded the weights on TensorFlow I had:
Accuracy : 89.38
Error : 10.62
Which seems very low compared to the 94.74% in the readme and paper.
Could it be that the accuracy reported for the CIFAR10 dataset is obtained using data augmentation?
This is somehow misleading because the accuracy reported in the paper doesn't use it.
Upload?
Hi Somshubra, thanks a lot for this great implementation of FC-DenseNets.
I believe that there may be a connection missing when you use include_top though. You are passing the skip block connection to the last convolutional layer, but I think that the correct approach would be to pass the full output of the last upsampling dense block.
I forked your code and made this change for my last project and got very nice results using a FC-DenseNet56 with sigmoid activation.
Take a look and see if you agree with me, I am happy to send a pull request and make the changes if you want.
Thanks
Dear @titu1994
I ran your code, but I cannot get performance as good as in the paper. Why?
The parameters are listed below:
batch_size = 64
nb_classes = 100
nb_epoch = 300
img_rows, img_cols = 32, 32
img_channels = 3
img_dim = (img_channels, img_rows, img_cols) if K.image_dim_ordering() == "th" else (img_rows, img_cols, img_channels)
depth = 40
nb_dense_block = 3
growth_rate = 12
nb_filter = 12
bottleneck = True
reduction = 0.2
dropout_rate = 0.2 # 0.0 for data augmentation
optimizer = Adam(lr=1e-2) # Using Adam instead of SGD to speed up training
The result is
loss: 2.3397 - acc: 0.4425 - val_loss: 2.5654 - val_acc: 0.4034
Accuracy : 40.34
Error : 59.66
Can you help me? Thanks!
Hi there, I am interested in using DenseNet for training my dataset to perform object classification and localization. I am wondering if this DenseNet repo allows for transfer learning and also object localization with bounding boxes?
Quite interested in DenseNet as it uses features from all layers due to its concatenations throughout the network!
Thanks guys!
The standard cifar100 model only obtains about 60% validation accuracy when letting the standard script run for as many epochs as it wants before stopping early.