titu1994 / densenet


DenseNet implementation in Keras

License: MIT License

Python 100.00%

densenet densenet-model paper bottleneck deep-learning keras

densenet's Introduction

Dense Net in Keras

DenseNet implementation of the paper Densely Connected Convolutional Networks in Keras

Now supports the more efficient DenseNet-BC (DenseNet-Bottleneck-Compressed) networks. The DenseNet-BC-190-40 model obtains state-of-the-art performance on CIFAR-10 and CIFAR-100.

Architecture

DenseNet is an extension to Wide Residual Networks. According to the paper:

The lth layer has l inputs, consisting of the feature maps of all preceding convolutional blocks. 
Its own feature maps are passed on to all L − l subsequent layers. This introduces L(L+1) / 2 connections 
in an L-layer network, instead of just L, as in traditional feed-forward architectures. 
Because of its dense connectivity pattern, we refer to our approach as Dense Convolutional Network (DenseNet).
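The connection count in the quote can be checked with a few lines of Python. This is only an illustrative sketch: the function names are made up for this example and are not part of densenet.py.

```python
def dense_connections(num_layers):
    """Sum the inputs of every layer when layer l receives l inputs (l = 1..L)."""
    return sum(range(1, num_layers + 1))


def closed_form(num_layers):
    """The L(L+1)/2 expression quoted from the paper."""
    return num_layers * (num_layers + 1) // 2


for L in (1, 4, 12):
    # Both columns agree; a plain feed-forward chain would have only L connections.
    print(L, dense_connections(L), closed_form(L))
```

For a 12-layer network both counts give 78 connections, compared with 12 for a plain chain.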

It features several improvements, such as:

Dense connectivity: connecting any layer to any other layer.
Growth rate parameter: dictates how fast the number of features increases as the network becomes deeper.
Consecutive functions: BatchNorm - ReLU - Conv, which is from the Wide ResNet paper and an improvement from the ResNet paper.
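To make the growth rate concrete, here is a small sketch of how the number of feature maps could grow inside one dense block. It assumes each layer adds exactly growth_rate feature maps to the running concatenation; the helper name and the example numbers are hypothetical.

```python
def channels_per_layer(initial_channels, growth_rate, num_layers):
    """Return the input width seen by each layer and the block's output width."""
    # Layer i (0-based) sees the block input plus i earlier outputs of growth_rate maps each.
    widths = [initial_channels + i * growth_rate for i in range(num_layers)]
    output_width = initial_channels + num_layers * growth_rate
    return widths, output_width


widths, output_width = channels_per_layer(initial_channels=16, growth_rate=12, num_layers=4)
print(widths)        # input widths of the four layers
print(output_width)  # width after the last concatenation
```

With 16 input channels and a growth rate of 12, the four layers see 16, 28, 40 and 52 channels, and the block outputs 64.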

The Bottleneck-Compressed DenseNets offer further benefits, such as a reduced number of parameters with similar or better performance.

Consider the DenseNet-100-12 model, with nearly 7 million parameters, against the DenseNet-BC-100-12, with just 0.8 million parameters. The BC model achieves 4.51 % error in comparison to the original model's 4.10 % error.
The best original model, DenseNet-100-24 (27.2 million parameters), achieves 3.74 % error, whereas the DenseNet-BC-190-40 (25.6 million parameters) achieves 3.46 % error, which is a new state-of-the-art performance on CIFAR-10.

Dense Nets have an architecture which can be shown in the following image from the paper:

Performance

The accuracy of DenseNet is reported in the paper, beating all previous benchmarks on CIFAR-10, CIFAR-100 and SVHN.

Usage

Import the densenet.py script and use the DenseNet(...) function to create a custom DenseNet model with a variety of parameters.

Examples :

import densenet

# Use (3, 32, 32) for 'th' (channels-first) dim-ordering,
# or (32, 32, 3) for 'tf' (channels-last) dim-ordering.
image_dim = (32, 32, 3)

model = densenet.DenseNet(classes=10, input_shape=image_dim, depth=40, growth_rate=12,
                          bottleneck=True, reduction=0.5)
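As a rough guide to what these arguments imply, the sketch below derives the layers per dense block and the channel widths for a three-block CIFAR-style network. It rests on several assumptions that may not match densenet.py exactly: depth = 3N + 4 as in the paper, a bottleneck layer counting as two convolutions, reduction being the fraction of feature maps removed in each transition, and 2 * growth_rate initial filters. The function plan_densenet is hypothetical.

```python
def plan_densenet(depth, growth_rate, bottleneck=False, reduction=0.0, initial_channels=None):
    """Sketch the layer count and channel widths of a 3-block DenseNet (assumptions above)."""
    if (depth - 4) % 3 != 0:
        raise ValueError("depth must be of the form 3N + 4")
    layers_per_block = (depth - 4) // 3
    if bottleneck:
        # Assumed: each bottleneck layer is a 1x1 conv followed by a 3x3 conv.
        layers_per_block //= 2
    channels = 2 * growth_rate if initial_channels is None else initial_channels
    compression = 1.0 - reduction  # assumed meaning of `reduction`
    plan = []
    for block in range(3):
        channels += layers_per_block * growth_rate
        plan.append(("dense_block", block + 1, channels))
        if block < 2:  # no transition after the last dense block
            channels = int(channels * compression)
            plan.append(("transition", block + 1, channels))
    return layers_per_block, plan


layers, plan = plan_densenet(depth=40, growth_rate=12, bottleneck=True, reduction=0.5)
print(layers)
for step in plan:
    print(step)
```

Under these assumptions the call above gives 6 layers per block and dense-block output widths of 96, 120 and 132 channels.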

Or import a pre-built DenseNet model for ImageNet; some of these models have pre-trained weights (121, 161 and 169).

Example :

import densenet

# Use (3, 224, 224) for 'th' (channels-first) dim-ordering,
# or (224, 224, 3) for 'tf' (channels-last) dim-ordering.
image_dim = (224, 224, 3)

model = densenet.DenseNetImageNet121(input_shape=image_dim)

Weights for the DenseNetImageNet121, DenseNetImageNet161 and DenseNetImageNet169 models are provided (in the release tab) and will be automatically downloaded when first called. They have been trained on ImageNet. The weights were ported from the repository https://github.com/flyyufelix/DenseNet-Keras.

Requirements

Keras
Theano (weights not tested) / Tensorflow (tested) / CNTK (weights not tested)
h5py


densenet's Issues

TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

I've just cloned the repo and executed cifar10_fast:

$ python cifar10_fast.py
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
Traceback (most recent call last):
  File "cifar10_fast.py", line 29, in <module>
    dropout_rate=dropout_rate)
  File "/home/moose/GitHub/DenseNet/densenet_fast.py", line 129, in create_dense_net
    weight_decay=weight_decay)
  File "/home/moose/GitHub/DenseNet/densenet_fast.py", line 86, in dense_block
    x = merge(feature_list, mode='concat', concat_axis=concat_axis)
  File "/home/moose/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 1680, in merge
    name=name)
  File "/home/moose/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 1301, in __init__
    self.add_inbound_node(layers, node_indices, tensor_indices)
  File "/home/moose/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 635, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/home/moose/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 172, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors, mask=input_masks))
  File "/home/moose/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 1394, in call
    return K.concatenate(inputs, axis=self.concat_axis)
  File "/home/moose/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 1427, in concatenate
    return tf.concat(axis, [to_dense(x) for x in tensors])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1047, in concat
    dtype=dtypes.int32).get_shape(
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 651, in convert_to_tensor
    as_ref=False)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 716, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/constant_op.py", line 165, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
    (dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

My System

Keras 1.2.1
Tensorflow 1.0.0

New Memory-Efficient Implementation of DenseNets Paper

Memory-Efficient Implementation of DenseNets
https://arxiv.org/abs/1707.06990

Apparently the memory-efficient DenseNet comes with a paper, which says pre-activation is the way to go. I wonder if TensorFlow has a way to do this memory-efficient version without implementing new ops.

ImportError: cannot import name _postprocess_conv2d_output

The environment I ran in was:
tensorflow-gpu 1.4.0
keras 2.0.9

The problem occurred while I ran python cifar10.py:

Traceback (most recent call last):
  File "densenet.py", line 28, in <module>
    from subpixel import SubPixelUpscaling
  File "/home1/zhucheng/contest/ChallengeAI/ai_challenger_scene_train_20170904/DenseNet/subpixel.py", line 11, in <module>
    import tensorflow_backend as K_BACKEND
  File "/home1/zhucheng/contest/ChallengeAI/ai_challenger_scene_train_20170904/DenseNet/tensorflow_backend.py", line 6, in <module>
    from keras.backend.tensorflow_backend import _postprocess_conv2d_output
ImportError: cannot import name _postprocess_conv2d_output

I am sure that I had installed the env well. Any advice for this? I am new to keras and tf.

Could you offer DenseNet_cifar100 weights?

Could you offer DenseNet_cifar100 weights?
I can't find this file on Google.

average or max pooling in transition blocks?

DenseNet/densenet.py

Line 530 in c197321

x = AveragePooling2D((2, 2), strides=(2, 2))(x)

DenseNet/densenet_fast.py

Line 56 in c197321

x = AveragePooling2D((2, 2), strides=(2, 2))(x)

cifar100 Accuracy : 62.51?

last 10 Epoch:

Epoch 190/200
2113s - loss: 1.0719 - acc: 0.8601 - val_loss: 2.2472 - val_acc: 0.6207
Epoch 191/200
2113s - loss: 1.0691 - acc: 0.8607 - val_loss: 2.1733 - val_acc: 0.6445
Epoch 192/200
2114s - loss: 1.0706 - acc: 0.8597 - val_loss: 2.1769 - val_acc: 0.6439
Epoch 193/200
2113s - loss: 1.0750 - acc: 0.8585 - val_loss: 2.2456 - val_acc: 0.6286
Epoch 194/200
2113s - loss: 1.0639 - acc: 0.8622 - val_loss: 2.2660 - val_acc: 0.6455
Epoch 195/200
2113s - loss: 1.0679 - acc: 0.8607 - val_loss: 2.1948 - val_acc: 0.6376
Epoch 196/200
2114s - loss: 1.0676 - acc: 0.8609 - val_loss: 2.1855 - val_acc: 0.6522
Epoch 197/200
2113s - loss: 1.0652 - acc: 0.8618 - val_loss: 2.4428 - val_acc: 0.6053
Epoch 198/200
2114s - loss: 1.0675 - acc: 0.8603 - val_loss: 2.2936 - val_acc: 0.6236
Epoch 199/200
2114s - loss: 1.0685 - acc: 0.8589 - val_loss: 2.1497 - val_acc: 0.6450
Epoch 200/200
2113s - loss: 1.0635 - acc: 0.8626 - val_loss: 2.2698 - val_acc: 0.6251
Accuracy : 62.51
Error : 37.49

running out of memory

I have an 8gb GTX 1080 and I'm running cifar10.py on TensorFlow, but it seems to run out of memory very easily. Is this to be expected? Once I shut down the gui to free up every last ounce of gpu memory and reduced the batch size to 32 it did start running and now I'm up to epoch 7 at about 4 minutes per epoch which seems a bit faster than what you have.

However, this is just with cifar10, so will it even be possible to load and train the imagenet version of DenseNet-40-12 without a smaller network choice?

About implementation of __dense_block

Hello, I have a problem with the function '__dense_block'. I think this function does not describe the structure as follows:

The code is:

def __dense_block(x, nb_layers, nb_filter, growth_rate, bottleneck=False, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True, return_concat_list=False):
    concat_axis = 1 if K.image_data_format() == 'channels_first' else -1

    x_list = [x]

    for i in range(nb_layers):
        cb = __conv_block(x, growth_rate, bottleneck, dropout_rate, weight_decay)
        x_list.append(cb)

        x = concatenate([x, cb], axis=concat_axis)

        if grow_nb_filters:
            nb_filter += growth_rate

    if return_concat_list:
        return x, nb_filter, x_list
    else:
        return x, nb_filter

I think this function did not implement the part of 'the output of the block is the concatenation of the outputs of the 4 layers, and thus contains 4 ∗ k feature maps'. Could you help me to figure it out?
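One way to reason about this question is to trace the loop with labels instead of tensors. The sketch below is not the repository's code: it replaces __conv_block with a stand-in that emits growth_rate labelled maps and replaces concatenate with list concatenation, so it only shows which feature maps each layer would see.

```python
def trace_dense_block(input_maps, num_layers, growth_rate):
    """Trace which feature maps each layer sees when x is re-concatenated every step."""
    x = list(input_maps)
    seen_by_layer = []
    for i in range(num_layers):
        seen_by_layer.append(list(x))  # the input handed to layer i + 1
        # Stand-in for __conv_block: each layer emits growth_rate new maps.
        new_maps = ["layer%d_map%d" % (i + 1, j) for j in range(growth_rate)]
        # Stand-in for concatenate([x, cb]): append along the channel axis.
        x = x + new_maps
    return x, seen_by_layer


out, seen = trace_dense_block(["in0", "in1"], num_layers=4, growth_rate=3)
print([len(s) for s in seen])  # widths seen by layers 1..4
print(len(out))                # block input plus 4 * growth_rate new maps
```

In this trace every layer receives the block input plus all earlier layer outputs, and the returned x contains the input followed by the 4 * k new maps. The concatenation of only the new maps, as in the quoted sentence, would correspond to the entries of x_list after the first one.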

poor cifar100 results

The standard cifar100 model only obtains about 60% validation accuracy when letting the standard script run for as many epochs as it wants before stopping early.

DenseNet-BC-190-40 weights

Hi, @titu1994 do you have the pretrained weights of DenseNet-BC-190-40 on CIFAR-10?

AttributeError: 'NoneType' object has no attribute 'get_file'

Traceback (most recent call last):
  File "D:/GitHub/DenseNet-master/imagenet_inference.py", line 26, in <module>
    print('Predicted:', decode_predictions(preds))
  File "C:\softwares\Anaconda3\envs\Keras\lib\site-packages\keras_applications\imagenet_utils.py", line 224, in decode_predictions
    fpath = keras_utils.get_file(
AttributeError: 'NoneType' object has no attribute 'get_file'

Keras 2.2.4. What is causing this bug, and how should I fix it? Thanks!

Do you have a Keras 2 version?

This is a very useful project for me. I spent some time modifying some details to make sure it can run on Keras 2, such as adding "require_flatten=include_top". However, it would be better if you could provide a Keras 2 version.

cifar10 results

I just ran a 300 epoch run using tensorflow and an unmodified cifar10.py from 54ed2d6 on a Titan X (old version) and got the following results:

Epoch 300/300
499/500 [============================>.] - ETA: 0s - loss: 0.0635 - acc: 0.9891Epoch 00299: val_acc did not improve
500/500 [==============================] - 181s - loss: 0.0635 - acc: 0.9891 - val_loss: 0.3646 - val_acc: 0.9224
Accuracy :  92.24
Error :  7.76

Here is the file:
DenseNet-40-12-CIFAR10.h5.zip

This definitely doesn't seem as good as previous training runs from the readme which cite 4.51 % error.

Plans for memory efficient implementation in Keras?

Hi @titu1994,
I checked out @joeyearsleys's repo, but it's limited to the densenet-121 architecture, while yours supports other architectures as well. Are you planning to add memory efficiency support to this?

fix the doc-string

DenseNet/densenet.py

Line 514 in c197321

    
           ''' Apply BatchNorm, Relu 1x1, Conv2D, optional compression, dropout and Maxpooling2D

There is no Dropout and AvgPooling is being used instead of MaxPooling

How can I retrain this model ?

Thank you for this implementation.
I want to know how to use this as transfer learning.
Would you please tell me step by step.

Densenet.py depth of 264 architecture

The depth of densenet-264 should be 201 not 264
line 425, I guess without loading weights for it there were no errors :P

About DenseNet

What does L(L+1)/2 mean?

TypeError: _obtain_input_shape() got an unexpected keyword argument 'dim_ordering'

Traceback (most recent call last):
  File "cifar10.py", line 31, in <module>
    growth_rate=growth_rate, nb_filter=nb_filter, dropout_rate=dropout_rate)
  File "DenseNet/densenet.py", line 98, in DenseNet
    include_top=include_top)
TypeError: _obtain_input_shape() got an unexpected keyword argument 'dim_ordering'

How to use 'DENSENET_121_WEIGHTS_PATH_NO_TOP'?

What's the meaning of "no-top"?

thx!

should input not be concatenated?

Also, looking at several implementations and network diagrams, such as this:

It appears that __denseblock needs to be modified a bit because the input is only connected to the output through conv blocks.

x_list = [x]

should be:

x_list = []

which should mean the relevant instances of [1:] in the file can go away. I also think this also means concat_list does not need to be returned.

the same should apply for keras-contrib

How to upload my own dataset instead of the Cifar10 dataset

I have a dataset organised into folders like this:
`-Training_Set
--Class1
--img1.jpg
--img2.jpg
..
--Class2
--img101.jpg
--img102.jpg
..
--Class3
--img201.jpg
--img202.jpg

-Test_Set
--Class1
--img10.jpg
--img11.jpg
..
--Class2
--img150.jpg
--img140.jpg
..
--Class3
--img210.jpg
--img220.jpg`

and I want to load this dataset instead of the Cifar10 dataset in this line of code using Keras (Python):

(trainX, trainY), (testX, testY) = cifar10.load_data()
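A minimal, standard-library-only sketch of indexing a class-per-folder layout like the one above follows. It only collects file paths and integer labels; decoding the images into arrays is left out. The function name and the extension list are assumptions made for this example.

```python
import os


def index_image_folder(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return (paths, labels, class_names) for a root folder with one sub-folder per class."""
    class_names = sorted(
        name for name in os.listdir(root) if os.path.isdir(os.path.join(root, name))
    )
    paths, labels = [], []
    for label, name in enumerate(class_names):
        class_dir = os.path.join(root, name)
        for fname in sorted(os.listdir(class_dir)):
            # Keep only files whose extension looks like an image.
            if fname.lower().endswith(extensions):
                paths.append(os.path.join(class_dir, fname))
                labels.append(label)
    return paths, labels, class_names
```

Calling index_image_folder("Training_Set") and index_image_folder("Test_Set") would give two path and label lists that could stand in for the (trainX, trainY), (testX, testY) split once the images are decoded and resized. Keras also ships ImageDataGenerator.flow_from_directory, which reads this kind of layout directly and may be the simpler route.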

I can not get good performance

Dear @titu1994

I ran your code, but I cannot get good performance like the paper. Why?

The parameters are listed below:

batch_size = 64
nb_classes = 100
nb_epoch = 300

img_rows, img_cols = 32, 32
img_channels = 3

img_dim = (img_channels, img_rows, img_cols) if K.image_dim_ordering() == "th" else (img_rows, img_cols, img_channels)
depth = 40
nb_dense_block = 3
growth_rate = 12
nb_filter = 12
bottleneck = True
reduction = 0.2
dropout_rate = 0.2 # 0.0 for data augmentation
optimizer = Adam(lr=1e-2) # Using Adam instead of SGD to speed up training

The result is
loss: 2.3397 - acc: 0.4425 - val_loss: 2.5654 - val_acc: 0.4034
Accuracy : 40.34
Error : 59.66

Can you help me, thanks!

Could you share a Cifar100 pretrained model with us?

Dear @titu1994

I want to train a baseline model on cifar100, dense-40-12. But I cannot obtain performance like the paper. Could you share your hyper-parameters or a pre-trained model on cifar100?

Thanks a million.

Submit Pull request to keras-contrib?

Keras has created a new official keras-contrib repository where they will accept a broader range of contributions than mainline keras, for eventual inclusion in mainline if it becomes widely used. Would you consider submitting your implementation?

https://github.com/farizrahman4u/keras-contrib

Implementation help

@titu1994 ,

Sorry for being a Python noob, but how does this implementation work? I tested it and see that the code works as is to construct the model, but it builds the model with a call to the DenseNet() function. My question is: how does it do this even though the DenseNet function never makes a call to the other DenseNet building block functions (the ones prefixed with "__")? This is very confusing to me but amazing at the same time. PLEASE HELP!

preprocess scale 0.017

Looking at the preprocess function I see:

 x *= 0.017 # scale values

I thought it might be 1/128 but that is 0.0078125. I can't find anywhere else that something similar is done either in upstream implementations or in the keras preprocessing. Do you have information or reasoning regarding this scaling factor?
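The arithmetic can be checked directly. One possible reading, which nothing in this thread confirms, is that 0.017 approximates the reciprocal of a per-channel standard deviation expressed on 0-255 pixel values. The ImageNet standard deviations below are the commonly quoted ones on a [0, 1] scale and are used here only for comparison.

```python
scale = 0.017

# Divisor implied by multiplying with 0.017.
implied_divisor = 1.0 / scale

# Commonly quoted ImageNet per-channel std on [0, 1] pixels, rescaled to 0-255.
imagenet_std_0_1 = (0.229, 0.224, 0.225)
imagenet_std_0_255 = [s * 255 for s in imagenet_std_0_1]

print(1 / 128)                    # the 1/128 guess from the question
print(round(implied_divisor, 2))  # divisor implied by 0.017
print([round(s, 3) for s in imagenet_std_0_255])
```

The implied divisor is about 58.8, which is close to those rescaled standard deviations and far from 128, so a division by an approximate standard deviation is at least a plausible explanation.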

Failure using Keras' fit_generator attempting to train image classifier

I created a DenseNet model following the instructions in the README.md. I am using Keras with TF for the back-end to train a model (from scratch) to classify 240x240 images into one of 13 classes.

Attached is a simplified script that shows how I set up the model, generators, and optimizer. The code fails when I call the train_generator's fit_generator method. Attached is the output log.

Here are my package versions:

Python 2.7.5
cv2 3.3.0
numpy 1.13.3
Keras 2.0.8
Tensorflow 1.2.1
h5py 2.7.1
NVCC release 8.0, V8.0.61
tensorflow-gpu 1.2.1

densenet_simplified.txt
densenet_output.txt

Any help in how I might change the configuration to use less memory (if that's what I am doing wrong) would be greatly appreciated.

Thanks,

Michael.

'global_average_pooling2d_1' doesn't exist in 'DenseNet-BC-121-32-no-top.h5'

I tried "model = DenseNetImageNet121((224,224,3),classes=5, weights='imagenet' ,include_top=False)" and I have downloaded 'DenseNet-BC-121-32-no-top.h5', so 'model.load_weights( 'DenseNet-BC-121-32-no-top.h5')' is executed.
But the error below is raised:
KeyError: "Unable to open object (Object 'global_average_pooling2d_1' doesn't exist)"
Anaconda3, Python 3.6, on Win7.

Training doesn't go past 1 epoch

I am training cifar10 dataset using cifar10.py script. I am able to start the training but the training doesn't go beyond 1 epoch. I am not getting any error messages. I waited for close to one hour for the training to move forward but nothing happens.

Is it because of the different versions of Keras/Theano? I am currently using keras version 2.0.2 and theano 0.9. And I am using TitanX gpu.

Cifar 10 numbers reported use data augmentation

Could it be that the accuracy reported for the CIFAR10 dataset is obtained using data augmentation?
This is somehow misleading because the accuracy reported in the paper doesn't use it.

Question on this repo DenseNet implementation

Hi there, I am interested in using DenseNet for training my dataset to perform object classification and localization. I am wondering if this DenseNet repo allows for transfer learning and also object localization with bounding boxes?

Quite interested in DenseNet as it uses features from all layers due to its concatenations throughout the network!

Thanks guys!

Running DenseNet on CPU

I ran Densenet.py on the MNIST dataset, but training hung with CPU utilization in the 1st epoch itself. Kindly suggest a way to run DenseNet Keras on the MNIST dataset, and advise on the procedure to run the same on GPU.

I am confused, Is this right?

def __dense_block(x, nb_layers, nb_filter, growth_rate, bottleneck=False, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True, return_concat_list=False):
    concat_axis = -1
    x_list = [x]
    for i in range(nb_layers):
        cb = __conv_block(x, growth_rate, bottleneck, dropout_rate, weight_decay)
        x_list.append(cb)
        x = concatenate([x, cb], axis=concat_axis)
        if grow_nb_filters:
            nb_filter += growth_rate

I want to know if it is an error to pass the growth_rate to __conv_block. Why not nb_filters?

have problem to get 76% accuracy on CIFAR10 dataset

Dear
I used the following settings for my network, but I cannot get 76% accuracy on the validation set. The optimizer is 'Adam' and I also did mean subtraction as preprocessing for the R, G, B channels separately.
Would you please help me?

model = Sequential()
model.add(Conv2D(32, (3, 3), border_mode='same', batch_input_shape=(None, 32, 32, 3)))
model.add(Activation('tanh'))
model.add(Conv2D(32, (3, 3), border_mode='same'))
model.add(Activation('tanh'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), border_mode='same'))
model.add(Activation('tanh'))
model.add(Conv2D(64, (3, 3), border_mode='same'))
model.add(Activation('tanh'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(1024, init='normal'))
model.add(Activation('tanh'))
model.add(Dropout(0.25))
model.add(Dense(10))
model.add(Activation('softmax'))

About Densenet architecture

How is L(L+1)/2 implemented?

Pretrained Cifar10 Weights are giving ~10% test accuracy

I downloaded the DenseNet-40-12-Tensorflow-Backend-TF-dim-ordering.h5 file from the Releases page, compiled the model, loaded the weights, and tested it, but only got a 10.06% accuracy. Am I using the pretrained models wrong?

My code:

model = densenet.DenseNet(img_dim, classes=nb_classes, depth=depth, nb_dense_block=nb_dense_block,
                          growth_rate=growth_rate, nb_filter=nb_filter, dropout_rate=dropout_rate, weights=None)
optimizer = Adam(lr=1e-3) # Using Adam instead of SGD to speed up training
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=["accuracy"])
print("Finished compiling")
print("Building model...")

(trainX, trainY), (testX, testY) = cifar10.load_data()

testX = testX.astype('float32')

testX = densenet.preprocess_input(testX)

Y_test = np_utils.to_categorical(testY, nb_classes)

# Load model
weights_file="weights/DenseNet-40-12-Tensorflow-Backend-TF-dim-ordering.h5"
if os.path.exists(weights_file):
    model.load_weights(weights_file, by_name=True)
    print("Model loaded.")

yPreds = model.predict(testX)
yPred = np.argmax(yPreds, axis=1)
yTrue = testY

accuracy = metrics.accuracy_score(yTrue, yPred) * 100
error = 100 - accuracy
print("Accuracy : ", accuracy)
print("Error : ", error)

Missing last connection on FC-DenseNet?

Hi Somshubra, thanks a lot for this great implementation of FC-DenseNets.

I believe that there may be a connection missing when you use include_top though. You are passing the skip block connection to the last convolutional layer, but I think that the correct approach would be to pass the full output of the last upsampling dense block.

I forked your code and made this change for my last project and got very nice results using a FC-DenseNet56 with sigmoid activation.

Take a look and see if you agree with me, I am happy to send a pull request and make the changes if you want.

Thanks

transfer learning - load pretrained model

Hello! I am facing a problem with the pre-trained model you are sharing. Whenever I try to load the weights (either for transfer learning or prediction) I get the following error message:

Exception: Layer #0 (named "initial_conv2D" in the current model) was found to correspond to layer convolution2d_1 in the save file. However the new layer initial_conv2D expects 1 weights, but the saved weights have 2 elements.

I can't tell whether I am doing something wrong or whether the problem is in the pretrained model...
Can you help?

Thank you in advance and thank you for your contributions!

bn relu conv bottleneck

It seems there may be another change needed for the bottleneck case based on the paper:

We find this design especially effective for DenseNet and we refer to our network with such a bottleneck layer, i.e., to the BN-ReLU-Conv(1×1)-BN-ReLU-Conv(3×3) version of H_l, as DenseNet-B.

It looks like the network here and in keras contrib doesn't do that order. I think it should be:

def __conv_block(x, nb_filter, bottleneck=False, dropout_rate=None, weight_decay=1e-4):
    '''
    Adds a convolution layer (with batch normalization and relu),
    and optionally a bottleneck layer.

    # Arguments
        x: Input tensor
        nb_filter: integer, the dimensionality of the output space
            (i.e. the number output of filters in the convolution)
        bottleneck: if True, adds a bottleneck convolution block
        dropout_rate: dropout rate
        weight_decay: weight decay factor

     # Input shape
        4D tensor with shape:
        `(samples, channels, rows, cols)` if data_format='channels_first'
        or 4D tensor with shape:
        `(samples, rows, cols, channels)` if data_format='channels_last'.

    # Output shape
        4D tensor with shape:
        `(samples, filters, new_rows, new_cols)` if data_format='channels_first'
        or 4D tensor with shape:
        `(samples, new_rows, new_cols, filters)` if data_format='channels_last'.
        `rows` and `cols` values might have changed due to stride.

    # Returns
        output tensor of block
    '''
    with K.name_scope('ConvBlock'):
        concat_axis = 1 if K.image_data_format() == 'channels_first' else -1

        if bottleneck:
            inter_channel = nb_filter * 4

            x = BatchNormalization(axis=concat_axis, epsilon=1.1e-5)(x)
            x = Activation('relu')(x)
            x = Conv2D(inter_channel, (1, 1), kernel_initializer='he_normal', padding='same', use_bias=False,
                       kernel_regularizer=l2(weight_decay))(x)

        x = BatchNormalization(axis=concat_axis, epsilon=1.1e-5)(x)
        x = Activation('relu')(x)
        x = Conv2D(nb_filter, (3, 3), kernel_initializer='he_normal', padding='same', use_bias=False)(x)
        if dropout_rate:
            x = Dropout(dropout_rate)(x)

    return x

Accuracy lower than expected

If I use the model you provided (DenseNet-40-12-Cifar10), the classification accuracy is 94.74%.
But when I try to train a new model by just uncommenting the "model.fit_generator(...)" line in Cifar10.py, after 200 epochs the accuracy is 88.3% (as seen on screen).
Is there something wrong, or how should I train a new model from the original Cifar10 dataset?
Thanks!

Training my own image

How do I train my own dataset?

Where is connection to 12 layers of each dense block?

In a DenseNet which has 12 layers, the second layer has 11 connections. I cannot find them. Please explain. Thanks.

Should the input to the include_top ignore the input to the last denseblock?

In the DenseNet FCN paper, the diagram in Figure 1 shows no skip connection from the input to the output of the last dense block in the decompression path.
But the function '__create_fcn_dense_net' uses the skip connection, as in
line 770 of densenet.py:
x = Conv2D(nb_classes, (1, 1), activation='linear', padding='same', use_bias=False)(x_up)

Do I misunderstand it? Or is this operation an improvement that I missed?

I am trying to create a DenseNet model without the last layer using the instruction:
model = densenet.DenseNet((32, 32,3), depth=40, growth_rate=12, nb_filter=16,include_top=False)

but it produces the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "densenet.py", line 169, in DenseNet
    model.load_weights(weights_path)
  File "/home/joheras/.virtualenvs/keras/local/lib/python2.7/site-packages/keras/engine/topology.py", line 2572, in load_weights
    load_weights_from_hdf5_group(f, self.layers)
  File "/home/joheras/.virtualenvs/keras/local/lib/python2.7/site-packages/keras/engine/topology.py", line 3012, in load_weights_from_hdf5_group
    g = f[name]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-build-YjgiH2/h5py/h5py/_objects.c:2687)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-build-YjgiH2/h5py/h5py/_objects.c:2645)
  File "/home/joheras/.virtualenvs/keras/local/lib/python2.7/site-packages/h5py/_hl/group.py", line 166, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-build-YjgiH2/h5py/h5py/_objects.c:2687)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-build-YjgiH2/h5py/h5py/_objects.c:2645)
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open (/tmp/pip-build-YjgiH2/h5py/h5py/h5o.c:3440)
KeyError: "Unable to open object (Object 'dense_2' doesn't exist)"

It does not matter whether I use theano or tensorflow as backend.

Do you know how to solve this?

Thanks in advance.
Jónathan

no longer works with newest keras

Get ImportError: cannot import name _obtain_input_shape with keras 2.2.2 as opposed to 2.2.0