
tensorflow-fcn's Introduction

Update

An example on how to integrate this code into your own semantic segmentation pipeline can be found in my KittiSeg project repository.

tensorflow-fcn

This is a one-file Tensorflow implementation of Fully Convolutional Networks. The code can easily be integrated into your semantic segmentation pipeline. The network can be applied directly or finetuned to perform semantic segmentation using Tensorflow training code.

Deconvolution layers are initialized as bilinear upsampling; conv and FCN layer weights are initialized from VGG weights. The VGG weights are read with numpy load, so neither Caffe nor Caffe-Tensorflow is required to run this. The .npy file for VGG16 needs to be downloaded before using this network. You can find the file here: ftp://mi.eng.cam.ac.uk/pub/mttt2/models/vgg16.npy

No Pascal VOC finetuning was applied to the weights. The model is meant to be finetuned on your own data. The model can be applied to an image directly (see test_fcn32_vgg.py) but the result will be rather coarse.

Requirements

In addition to Tensorflow, the following packages are required:

numpy, scipy, pillow, matplotlib

Those packages can be installed by running pip install -r requirements.txt or pip install numpy scipy pillow matplotlib.

Tensorflow 1.0rc

This code requires Tensorflow version >= 1.0rc to run. If you want to use an older version, you can try commit bf9400c6303826e1c25bf09a3b032e51cef57e3b. That commit has been tested using the pip versions 0.12, 0.11 and 0.10.

Tensorflow 1.0 comes with a large number of breaking API changes. If you are currently running an older Tensorflow version, I would suggest creating a new virtualenv and installing 1.0rc using:

export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.0rc0-cp27-none-linux_x86_64.whl
pip install --upgrade $TF_BINARY_URL

The above commands install the Linux version with GPU support. For other versions, follow the instructions here.

Usage

Run python test_fcn32_vgg.py to test the implementation.

Use this to build the VGG object for finetuning:

vgg = vgg16.Vgg16()
vgg.build(images, train=True, num_classes=num_classes, random_init_fc8=True)

Here images is a tensor with shape [None, h, w, 3], where h and w can have arbitrary size.

Trick: the tensor can be a placeholder, a variable or even a constant.

Be aware that num_classes influences how score_fr (the original fc8 layer) is initialized. For finetuning I recommend using the option random_init_fc8=True.
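For plain inference, here is a minimal sketch in the spirit of test_fcn32_vgg.py; the image path is hypothetical, and vgg16.npy must be downloaded as described above:

import numpy as np
import scipy.misc
import tensorflow as tf

import fcn32_vgg

img = scipy.misc.imread("./my_image.png")  # hypothetical input image

with tf.Session() as sess:
    images = tf.placeholder("float")  # fed with shape [None, h, w, 3]
    batch = np.expand_dims(img, axis=0)

    vgg_fcn = fcn32_vgg.FCN32VGG()  # looks for vgg16.npy
    vgg_fcn.build(images, debug=True)

    sess.run(tf.global_variables_initializer())
    # pred is the coarse prediction, pred_up the upsampled one.
    down, up = sess.run([vgg_fcn.pred, vgg_fcn.pred_up],
                        feed_dict={images: batch})
    # up[0] is an [h, w] map of predicted class indices.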

Training

Example code for training can be found in the KittiSeg project repository.

Finetuning and training

For training, build the graph using vgg.build(images, train=True, num_classes=num_classes), where images is a queue yielding image batches. Use a softmax_cross_entropy loss function on top of the output of vgg.up. An implementation of the loss function can be found in loss.py.

To train the graph you need an input producer and a training script. Have a look at TensorVision to see how to build those.

I had success finetuning the network using the Adam optimizer with a learning rate of 1e-6.
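Putting these pieces together, a minimal training sketch; my_input_producer is a hypothetical stand-in for your own input queue, while loss.py and fcn8_vgg.py are from this repository:

import tensorflow as tf

import fcn8_vgg
import loss

num_classes = 20  # hypothetical; set this for your dataset

# Hypothetical queue yielding float images [batch, h, w, 3] and
# one-hot labels [batch, h, w, num_classes].
images, labels = my_input_producer()

vgg_fcn = fcn8_vgg.FCN8VGG()
vgg_fcn.build(images, train=True, num_classes=num_classes,
              random_init_fc8=True)

# upscore32 holds the final upsampled logits of the FCN8 variant.
total_loss = loss.loss(vgg_fcn.upscore32, labels, num_classes)
train_op = tf.train.AdamOptimizer(1e-6).minimize(total_loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    for step in range(100000):
        _, loss_value = sess.run([train_op, total_loss])
    coord.request_stop()
    coord.join(threads)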

Content

Currently, the following models are provided:

  • FCN32
  • FCN16
  • FCN8

Remark

The deconv layer of Tensorflow allows you to provide a shape. The crop layer of the original implementation is therefore not needed.
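For reference, a sketch of that mechanism with illustrative names (not the repository's exact _upscore_layer): tf.nn.conv2d_transpose accepts an explicit output_shape, so the output can be sized to match the input at run time:

import tensorflow as tf

def upscore_sketch(bottom, up_filter, num_classes, stride=2):
    # Desired output shape, computed from the input tensor at run time.
    in_shape = tf.shape(bottom)
    output_shape = tf.stack([in_shape[0],           # batch
                             in_shape[1] * stride,  # height
                             in_shape[2] * stride,  # width
                             num_classes])
    return tf.nn.conv2d_transpose(bottom, up_filter, output_shape,
                                  strides=[1, stride, stride, 1],
                                  padding='SAME')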

I have slightly altered the naming of the upscore layer.

Field of View

The receptive field (also known as field of view) of the provided model is:

( ( ( ( ( 7 ) * 2 + 6 ) * 2 + 6 ) * 2 + 6 ) * 2 + 4 ) * 2 + 4 = 404
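The same number can be reproduced by walking the VGG16 stages backwards from the 7x7 fc6 kernel: each 2x2 pooling doubles the field and each 3x3 convolution adds 2, with 3, 3, 3, 2 and 2 convolutions per block (quick check in Python):

rf = 7  # fc6 is a 7x7 convolution on pool5
for convs_per_block in [3, 3, 3, 2, 2]:  # VGG16 blocks, deepest first
    rf = rf * 2 + 2 * convs_per_block    # undo the 2x pooling, add 3x3 convs
print(rf)  # 404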

Predecessors

Weights were generated using Caffe to Tensorflow. The VGG implementation is based on tensorflow-vgg16 and the numpy loading is based on tensorflow-vgg. You do not need any of the above cited code to run the model, nor do you need Caffe.

Install

Installing matplotlib from pip requires the following packages to be installed: libpng-dev, libjpeg8-dev, libfreetype6-dev and pkg-config. On Debian, Linux Mint and Ubuntu systems type:

sudo apt-get install libpng-dev libjpeg8-dev libfreetype6-dev pkg-config
pip install -r requirements.txt

TODO

  • Provide finetuned FCN weights.
  • Provide general training code.


tensorflow-fcn's Issues

Extend your get_deconv_filter from 2D to 3D. How?

I want to extend your get_deconv_filter from 2D to 3D. In Caffe, it can be implemented as:

int f = ceil(blob->shape(-1) / 2.);
float c = (2 * f - 1 - f % 2) / (2. * f);
for (int i = 0; i < blob->count(); ++i) {
  float x = i % blob->shape(-1);
  float y = (i / blob->shape(-1)) % blob->shape(-2);
  float z = (i / (blob->shape(-1) * blob->shape(-2))) % blob->shape(-3);
  data[i] = (1 - fabs(x / f - c)) * (1 - fabs(y / f - c)) * (1 - fabs(z / f - c));
}

So the get_deconv_filter function would be changed to:

def get_deconv_filter(self, f_shape):
    width = f_shape[0]
    height = f_shape[0]
    depth = f_shape[0]
    f = ceil(width / 2.0)
    c = (2 * f - 1 - f % 2) / (2.0 * f)
    bilinear = np.zeros([f_shape[0], f_shape[1], f_shape[2]])
    for x in range(width):
        for y in range(height):
            for z in range(depth):
                value = (1 - abs(x / f - c)) * (1 - abs(y / f - c)) * (1 - abs(z / f - c))
                bilinear[x, y, z] = value
    weights = np.zeros(f_shape)
    for i in range(f_shape[2]):
        weights[:, :, :, i, i] = bilinear

    init = tf.constant_initializer(value=weights,
                                   dtype=tf.float32)
    return tf.get_variable(name="up_filter", initializer=init,
                           shape=weights.shape)

Could you please look at my code and correct me if it is wrong? Thanks.
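Not an authoritative answer, but for comparison, a sketch of a trilinear initializer for a 5-D filter of shape [depth, height, width, out_channels, in_channels], assuming a cubic kernel and out_channels == in_channels; note the channel loop runs over f_shape[3] rather than the spatial f_shape[2]:

from math import ceil

import numpy as np
import tensorflow as tf

def get_deconv_filter_3d(f_shape):
    # f_shape: [depth, height, width, out_channels, in_channels]
    f = ceil(f_shape[0] / 2.0)
    c = (2 * f - 1 - f % 2) / (2.0 * f)
    trilinear = np.zeros(f_shape[:3])
    for x in range(f_shape[0]):
        for y in range(f_shape[1]):
            for z in range(f_shape[2]):
                trilinear[x, y, z] = ((1 - abs(x / f - c)) *
                                      (1 - abs(y / f - c)) *
                                      (1 - abs(z / f - c)))
    weights = np.zeros(f_shape)
    for i in range(f_shape[3]):  # channel diagonal, as in the 2D version
        weights[:, :, :, i, i] = trilinear
    init = tf.constant_initializer(value=weights, dtype=tf.float32)
    return tf.get_variable(name="up_filter", initializer=init,
                           shape=weights.shape)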

why kSize of 64 in upscore_layer

Hi @MarvinTeichmann,
I was wondering how you came up with the value 64 for kSize in the upscore_layer of fcn32.

Also, I was wondering how you came up with the bilinear value calculations for the deconv_filter?

Thanks!

the kernel size of the upscore layer (deconvolution) in fcn32s

Hi. I want to know why the kernel size of the upscore layer (32s) is 64. Why not 404? The paper gives the VGG16 output's receptive field as 404, and I think that should also be the kernel size of the deconvolution. I tried k-size = 64, but the connection between input and output (receptive field) is wrong.
Can you tell me? Thank you very much!

target_size=None can not work.

I found this code handling the input shape:

    if target_size:
        input_shape = target_size + (3,)
    else:
        input_shape = (None, None, 3)

When I set target_size=None, I encountered the error below:

Traceback (most recent call last):
  File "train.py", line 132, in <module>
    data_dir, label_dir, target_size=target_size, resume_training=True)
  File "train.py", line 102, in train
    batch_size=batch_size, shuffle=True
  File "/data/kuixu/exper/fcn/Keras-FCN/utils/SegDataGenerator.py", line 186, in flow_from_directory
    save_to_dir=save_to_dir, save_prefix=save_prefix, save_format=save_format)
  File "/data/kuixu/exper/fcn/Keras-FCN/utils/SegDataGenerator.py", line 26, in __init__
    self.target_size = tuple(target_size)
TypeError: 'NoneType' object is not iterable

please help

training dataset

Hi, what is the format of the data for training?
I have a folder of images to train on. How do I pass it to the training?

Change loss function to use tensorflow defaults

Is there any specific reason why you chose to write a custom loss function instead of directly using this?

# mask_.shape == (batch_size, h, w, n_classes)
# y_mask_.shape == (batch_size*h*w, n_classes)
mask_ =  tf.reshape(mask_, (-1, n_classes))
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=mask_, labels=y_mask))

I am wondering if there is any added advantage by using your loss function.

Error in loss function

   epsilon = tf.constant(value=1e-4)     
   logits = logits + epsilon
   softmax = tf.nn.softmax(logits)

It should be:

   epsilon = tf.constant(value=1e-4)
   softmax = tf.nn.softmax(logits) + epsilon

logits.get_shape() cannot get shape when I train the network

I built the training graph and want to train the network, but shape = [logits.get_shape()[0], num_classes] cannot get the shape of logits in loss.py. The error is:

  File "/home/kang/Documents/work_code_PC1/CamVid_tensorflow_FCN/fcn8_vgg_train.py", line 79, in train_test
    losses = loss.loss(logits,labels,fcn_inputs.NUM_CLASSES)

  File "loss.py", line 42, in loss
    epsilon = tf.constant(value=FLAGS.epsilon, shape=shape)

  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/constant_op.py", line 162, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape))

  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 363, in make_tensor_proto
    shape = [int(dim) for dim in shape]

TypeError: __int__ returned non-int (type NoneType)

my code is as follows:

def train_test():
    with tf.Graph().as_default():
        global_step = tf.Variable(0, trainable=False)
        images, labels = fcn_inputs._input_pipeline(FLAGS.txtfile,batch_size = FLAGS.batch_size,
                                                    image_shape = [360,480,3],
                                                    label_tensor_shape = [360,480,fcn_inputs.NUM_CLASSES])
        vgg_fcn = fcn8_vgg.FCN8VGG()
        vgg_fcn.build(images, train=True, num_classes = fcn_inputs.NUM_CLASSES,
                      random_init_fc8 = True, debug=True)

        logits = vgg_fcn.upscore32
        print(logits.get_shape()) 

        losses = loss.loss(logits,labels,fcn_inputs.NUM_CLASSES)
def loss(logits, labels, num_classes):
    """Calculate the loss from the logits and the labels.

    Args:
      hypes: dict
          hyperparameters of the model
      logits: tensor, float - [batch_size, width, height, num_classes].
          Use vgg_fcn.up as logits.
      labels: Labels tensor, int32 - [batch_size, width, height, num_classes].
          The ground truth of your data.

    Returns:
      loss: Loss tensor of type float.
    """
    with tf.name_scope('loss'):
        logits = tf.reshape(logits, (-1, num_classes))

        shape = [logits.get_shape()[0], num_classes]
        epsilon = tf.constant(value=FLAGS.epsilon, shape=shape)

This confuses me: why can't Tensorflow get the shape of upscore32?
Thank you very much.

Out of Memory if the batch size is larger than 5

I use a GTX 1080 (8 GB). The images in the training dataset are 640*480.
When the batch size is larger than 5, the GPU runs out of memory. The weights file is only 515 MB, so is something wrong?
Second, training is very slow: with a batch size of 2, each step needs five seconds.
Do you know the reason? Thanks! @MarvinTeichmann

Some errors in a function of FCN32_vgg

In the file FCN32_vgg.py, in the function get_deconv_filter(self, f_shape), you are doing

for i in range(f_shape[2]):
    weights[:, :, i, i] = bilinear

at line 240. If the number of input channels is different from the number of classes, your weight initialization will not work. If you can explain why you did weights[:, :, i, i] instead of weights[:, :, i, j] with iteration over both i and j, it would be of great help.

ValueError: No gradients provided for any variable

Hi, Marvin! I am trying to use your code to train a model, but I ran into the error below:

ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ["<tf.Variable 'conv1_1/filter:0' shape=(3, 3, 3, 64) dtype=float32_ref>", ......,"<tf.Variable 'upscore32/up_filter:0' shape=(16, 16, 37, 37) dtype=float32_ref>"] and loss Tensor("content_vgg/loss/total_loss:0", shape=(), dtype=float32)

I know this is not a bug report, but it did take me a lot of time to check where the issue lies. I am wondering if you have any idea about this case.

  • Here is my training code:
# read data, get batch image/depth/label
# tensor with dtype uint8

trainImg, trainDepth, trainLabel = Dataset.getDataName(split='train')
dataset = Dataset.create_dataset(trainImg, trainDepth, trainLabel, batchsize=32)
iterator = dataset.make_one_shot_iterator()
(img, depth, label) = iterator.get_next()

# build the network
NUM_CLASSES = 37
vgg_fcn = fcn8_vgg.FCN8VGG()
with tf.name_scope("content_vgg"):
    vgg_fcn.build(img, num_classes=NUM_CLASSES, debug=True, train=True)
    Logits = vgg_fcn.pred_up
    Loss = loss(logits=Logits, labels=label, num_classes=NUM_CLASSES)
    train_op = tf.train.AdamOptimizer(1e-3).minimize(Loss)
    print('Finished building Network.')

# start training
with tf.Session() as sess:
    logging.info("Start Initializing Variabels.")
    init = tf.global_variables_initializer()
    sess.run(init)

    print('Training the Network')
    sess.run(train_op)
  • I use the Dataset API to read batch images.
    As your original code uses a placeholder to read data and the placeholder (dtype 'float') gives the image an implicit conversion, I changed fcn8_vgg.py a little bit:
        with tf.name_scope('Processing'):

            # here I cast the image tensor from uint8 to float32

            rgb = tf.cast(rgb, dtype=tf.float32)

            red, green, blue = tf.split(rgb, 3, 3)
            bgr = tf.concat([
                blue - VGG_MEAN[0],
                green - VGG_MEAN[1],
                red - VGG_MEAN[2],
            ], 3)

and in loss.py:

    with tf.name_scope('loss'):

        # I cast logits to float32 because loss.py mentions that logits have to be a float tensor

        logits = tf.to_float(tf.reshape(logits, (-1, num_classes)))
        # logits = tf.reshape(logits, (-1, num_classes))
        
        epsilon = tf.constant(value=1e-4)
        labels = tf.to_float(tf.reshape(labels, (-1, num_classes)))
        softmax = tf.nn.softmax(logits) + epsilon

        if head is not None:
            cross_entropy = -tf.reduce_sum(tf.multiply(labels * tf.log(softmax),
                                           head), reduction_indices=[1])
        else:
            cross_entropy = -tf.reduce_sum(
                labels * tf.log(softmax), reduction_indices=[1])

        cross_entropy_mean = tf.reduce_mean(cross_entropy,
                                            name='xentropy_mean')
        tf.add_to_collection('losses', cross_entropy_mean)

        loss = tf.add_n(tf.get_collection('losses'), name='total_loss')
    return loss

That is all I changed. My tensorflow __version__ is 1.4.0.
There seem to be wrong connections in my graph; can you give me some advice?

I trained on the "ADEChallengeData2016" dataset, but the loss is almost the same

I use your code to train on the ADEChallengeData2016 dataset, but the loss is almost the same across epochs:
epoch: 19 || batch: 92 || l: 3.161138
epoch: 19 || batch: 93 || l: 3.038326
epoch: 19 || batch: 94 || l: 3.166589
epoch: 19 || batch: 95 || l: 3.135044
epoch: 19 || batch: 96 || l: 3.399570
epoch: 19 || batch: 97 || l: 2.850001
epoch: 19 || batch: 98 || l: 3.409142
epoch: 19 || batch: 99 || l: 3.082799
epoch: 19 || batch: 100 || l: 3.092099
epoch: 19 || batch: 101 || l: 3.262898
epoch: 19 || batch: 102 || l: 3.120981
epoch: 19 || batch: 103 || l: 3.275167
epoch: 19 || batch: 104 || l: 3.187988
epoch: 19 || batch: 105 || l: 3.516507
epoch: 19 || batch: 106 || l: 2.979967
epoch: 19 || batch: 107 || l: 3.051030
epoch: 19 || batch: 108 || l: 3.491227
epoch: 19 || batch: 109 || l: 3.281879
epoch: 19 || batch: 110 || l: 3.299110
epoch: 19 || batch: 111 || l: 3.144029

My loss code is the following:

loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=vgg_fcn.upscore32,
    labels=tf.squeeze(mask, squeeze_dims=[3]),
    name="entropy"))
train = tf.train.AdamOptimizer(0.005).minimize(loss)
batch_size = 15

What should I do?

AttributeError: type object 'GraphKeys' has no attribute 'REGULARIZATION_LOSSES'

Hi!
I got this error.

npy file loaded
Layer name: conv1_1
Layer shape: (3, 3, 3, 64)
Traceback (most recent call last):
  File "test_fcn32_vgg.py", line 24, in <module>
    vgg_fcn.build(batch_images, debug=True)
  File "/home/sheldon/tensorflow-fcn-master/fcn32_vgg.py", line 74, in build
    self.conv1_1 = self._conv_layer(bgr, "conv1_1")
  File "/home/sheldon/tensorflow-fcn-master/fcn32_vgg.py", line 135, in _conv_layer
    filt = self.get_conv_filter(name)
  File "/home/sheldon/tensorflow-fcn-master/fcn32_vgg.py", line 259, in get_conv_filter
    tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES,
AttributeError: type object 'GraphKeys' has no attribute 'REGULARIZATION_LOSSES'

Python 2.7, Tensorflow 1.2.
Could you give me some suggestions?

A problem about the deconvolution

Hi Marvin, thanks for your brilliant code. I have learned a lot.
But I have a question about the implementation of the deconvolution.
First, it seems that the number of input channels has to equal the number of output channels. I want to know why you don't use tf.image.resize_bilinear(). I have read the official Caffe code; the learning rate of their deconvolution layer is zero.
Second, I cannot understand how you define the deconvolution filters (i.e. the way you construct get_deconv_filter). Can you tell me your thoughts or point me to some papers or blogs I can refer to?
Hoping for your reply. Thanks @MarvinTeichmann

test_fcn32 fails, but test_fcn8 and test_fcn16 complete

Hello,

I am testing the tensorflow-fcn implementation on a new PC and am unable to get test_fcn32 to complete, although test_fcn8 and test_fcn16 both complete.

test_fcn32 fails after "Shape of pool5[1 12 16 512]". Do you have any advice on what may be going wrong? I wonder if it has to do with my GPU, but the GPU seems to operate fine under fcn8 and fcn16. What do you think? Can you help?

Thanks!

2017-07-25 10:31:25.240324: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-07-25 10:31:25.240571: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-25 10:31:25.240779: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-25 10:31:25.241016: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-25 10:31:25.241228: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-25 10:31:25.241513: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-07-25 10:31:25.242272: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-25 10:31:25.242685: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-07-25 10:31:27.164173: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:940] Found device 0 with properties:
name: Quadro M1200
major: 5 minor: 0 memoryClockRate (GHz) 1.148
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.35GiB
2017-07-25 10:31:27.164414: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:961] DMA: 0
2017-07-25 10:31:27.164536: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0: Y
2017-07-25 10:31:27.164706: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Quadro M1200, pci bus id: 0000:01:00.0)
npy file loaded
Layer name: conv1_1
Layer shape: (3, 3, 3, 64)
Layer name: conv1_2
Layer shape: (3, 3, 64, 64)
Layer name: conv2_1
Layer shape: (3, 3, 64, 128)
Layer name: conv2_2
Layer shape: (3, 3, 128, 128)
Layer name: conv3_1
Layer shape: (3, 3, 128, 256)
Layer name: conv3_2
Layer shape: (3, 3, 256, 256)
Layer name: conv3_3
Layer shape: (3, 3, 256, 256)
Layer name: conv4_1
Layer shape: (3, 3, 256, 512)
Layer name: conv4_2
Layer shape: (3, 3, 512, 512)
Layer name: conv4_3
Layer shape: (3, 3, 512, 512)
Layer name: conv5_1
Layer shape: (3, 3, 512, 512)
Layer name: conv5_2
Layer shape: (3, 3, 512, 512)
Layer name: conv5_3
Layer shape: (3, 3, 512, 512)
Layer name: fc6
Layer shape: [7, 7, 512, 4096]
Layer name: fc7
Layer shape: [1, 1, 4096, 4096]
Layer name: fc8
Layer shape: [1, 1, 4096, 1000]
Finished building Network.
Running the Network
2017-07-25 10:31:32.566719: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\kernels\logging_ops.cc:79] Shape of input image: [1 368 489 3]
2017-07-25 10:31:33.329698: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\kernels\logging_ops.cc:79] Shape of pool1[1 184 245 64]
2017-07-25 10:31:33.842707: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\kernels\logging_ops.cc:79] Shape of pool2[1 92 123 128]
2017-07-25 10:31:34.442268: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\kernels\logging_ops.cc:79] Shape of pool3[1 46 62 256]
2017-07-25 10:31:35.171996: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\kernels\logging_ops.cc:79] Shape of pool4[1 23 31 512]
2017-07-25 10:31:35.372775: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\kernels\logging_ops.cc:79] Shape of pool5[1 12 16 512]
2017-07-25 10:31:38.214065: E c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_driver.cc:1068] failed to synchronize the stop event: CUDA_ERROR_LAUNCH_FAILED
2017-07-25 10:31:38.214361: E c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_timer.cc:54] Internal: error destroying CUDA event in context 0000022AC3660650: CUDA_ERROR_LAUNCH_FAILED
2017-07-25 10:31:38.214616: E c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_timer.cc:59] Internal: error destroying CUDA event in context 0000022AC3660650: CUDA_ERROR_LAUNCH_FAILED
2017-07-25 10:31:38.214852: F c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\stream_executor\cuda\cuda_dnn.cc:2479] failed to enqueue convolution on stream: CUDNN_STATUS_EXECUTION_FAILED

Process finished with exit code -1073740791 (0xC0000409)

Implementation for fully connected layer.

Hi, I was directed here by

https://datascience.stackexchange.com/questions/12830/how-are-1x1-convolutions-the-same-as-a-fully-connected-layer

if name == 'fc6':
    filt = self.get_fc_weight_reshape(name, [7, 7, 512, 4096])
elif name == 'score_fr':
    name = 'fc8'  # Name of score_fr layer in VGG Model
    filt = self.get_fc_weight_reshape(name, [1, 1, 4096, 1000],
                                      num_classes=num_classes)
else:
    filt = self.get_fc_weight_reshape(name, [1, 1, 4096, 4096])

conv = tf.nn.conv2d(bottom, filt, [1, 1, 1, 1], padding='SAME')

My question is on the 'fc6' layer. Assume that at that layer the input (bottom here) has shape [batch_size, 7, 7, 512] and the weight matrix (filt here) has shape [7, 7, 512, 4096]. Then after tf.nn.conv2d(bottom, filt, [1, 1, 1, 1], padding='SAME'), the output (conv here) should have shape [batch_size, 7, 7, 4096]. So even if it is a 1 x 1 convolution, it is not a fully connected layer.
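Not from the repository, but a small numerical check of the equivalence under discussion: on a 1x1 spatial input, a 1x1 convolution computes exactly the same product as a fully connected layer (illustrative sizes):

import numpy as np
import tensorflow as tf

c_in, c_out = 16, 8  # illustrative channel sizes
x = np.random.randn(1, 1, 1, c_in).astype(np.float32)  # one "pixel"
w = np.random.randn(c_in, c_out).astype(np.float32)    # fc weight matrix

inp = tf.constant(x)
# Fully connected layer: flatten, then matmul.
fc = tf.matmul(tf.reshape(inp, [1, c_in]), tf.constant(w))
# The same weights reshaped into a 1x1 convolution filter.
conv = tf.nn.conv2d(inp, tf.constant(w.reshape(1, 1, c_in, c_out)),
                    [1, 1, 1, 1], padding='SAME')

with tf.Session() as sess:
    a, b = sess.run([fc, tf.reshape(conv, [1, c_out])])
    print(np.allclose(a, b, atol=1e-5))  # True

On a larger grid, such as the 7x7 input of fc6, the convolution applies that same fully connected mapping at every spatial position, which is why conv keeps the shape [batch_size, 7, 7, 4096] instead of collapsing to a vector.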

The problem about fcn8_vgg.py

Hi Marvin, thanks for your code!
I have two questions. Can you help me?
First, I want to know why you convert RGB to BGR in fcn8_vgg.
Second, what is the meaning of the following code (at line 123)?

if use_dilated:
    self.pool5 = tf.batch_to_space(self.pool5, crops=pad, block_size=2)
    self.pool5 = tf.batch_to_space(self.pool5, crops=pad, block_size=2)
    self.fc7 = tf.batch_to_space(self.fc7, crops=pad, block_size=2)
    self.fc7 = tf.batch_to_space(self.fc7, crops=pad, block_size=2)

A problem about fcn8_vgg.py

When I ran fcn8_vgg.py, it threw an exception:
2017-09-06 19:42:22.785673: W tensorflow/core/framework/op_kernel.cc:1152] Invalid argument: Conv2DSlowBackpropInput: Size of out_backprop doesn't match computed: actual = 23, computed = 12
2017-09-06 19:42:22.785941: W tensorflow/core/framework/op_kernel.cc:1152] Invalid argument: Conv2DSlowBackpropInput: Size of out_backprop doesn't match computed: actual = 23, computed = 12
[[Node: content_vgg/upscore2/conv2d_transpose = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](content_vgg/upscore2/stack, upscore2/up_filter/read, content_vgg/score_fr/BiasAdd)]]
......
InvalidArgumentError (see above for traceback): Conv2DSlowBackpropInput: Size of out_backprop doesn't match computed: actual = 23, computed = 12
[[Node: content_vgg/upscore2/conv2d_transpose = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](content_vgg/upscore2/stack, upscore2/up_filter/read, content_vgg/score_fr/BiasAdd)]]

regarding using this model for my own data set with only two classes

Hi Marvin,

Thank you for sharing the code. May I ask you for some advice?

I am trying to adapt your code to one of my datasets, which has about 300 images of size 600×800. The masks have two classes; class 1 takes up about 20% of the whole image. The images themselves are quite far from the dataset used to pre-train the VGG model, for instance bio-medical data.

There are several questions:

  1. Can I still use the pre-trained VGG weights?
  2. Which part of the code need I change to incorporate the two-classes case?
  3. Since the image size is pretty big and the number of images is limited, I have been planning to do patch-wise training, i.e., creating sample patches from the original images. Should I heavily sample the areas corresponding to class 1? During training, do I have to put all the patches from a single image into a single batch, or can I just use batch size = 1?

Wenouyang

An example of images with shape [None, h, w, 3]

In the README you mention an example where images is a tensor with shape [None, h, w, 3], but I cannot find anywhere in the library where a tensor of this shape is defined. Also, for example in the build() of fcn8_vgg.py, you have the following code for the rgb parameter (the image in this case):

red, green, blue = tf.split(rgb, 3, 3)

Doesn't this assume just a single image, so that processing batches would error out? Sorry if the question appears redundant to you, but I am wondering if I am missing something here.

Possible bug in the upsampling layer?

Hello there,

I'm not sure if what I'm saying is correct because I cannot find out how conv2d_transpose works. However, in your get_deconv_filter function you are not filling the bilinear filters in all channels; is that intentional?

More specifically, when you do this:

for i in range(f_shape[2]):
    weights[:, :, i, i] = bilinear

you are only filling "a diagonal" of filters. Is that how it is supposed to be? Or should it be something like this:

for i in range(f_shape[2]):
    for j in range(f_shape[3]):
        weights[:, :, i, j] = bilinear

Again, I'm not sure if this was intentional because I do not know how conv2d_transpose works.

Patrick
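For context, a standalone sketch of the 2D initializer quoted above, assuming in_channels == out_channels as in the repository: filling only the (i, i) pairs means every class channel is upsampled independently by the same bilinear kernel, so classes do not mix during upsampling:

from math import ceil

import numpy as np

def bilinear_upsample_weights(f_shape):
    # f_shape: [height, width, out_channels, in_channels]
    f = ceil(f_shape[0] / 2.0)
    c = (2 * f - 1 - f % 2) / (2.0 * f)
    bilinear = np.zeros([f_shape[0], f_shape[1]])
    for x in range(f_shape[0]):
        for y in range(f_shape[1]):
            bilinear[x, y] = (1 - abs(x / f - c)) * (1 - abs(y / f - c))
    weights = np.zeros(f_shape)
    for i in range(f_shape[2]):
        # Channel i of the input maps only to channel i of the output.
        weights[:, :, i, i] = bilinear
    return weights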

convert the ckpt model to movidius graph

When I convert the ckpt model to a Movidius graph, I run into this problem:
InvalidArgumentError (see above for traceback): Number of ways to split should evenly divide the split dimension, but got split_dim 3 (size = 224) and num_split 3 [[Node: Validation/Validation/Processing/split = Split[T=DT_FLOAT, num_split=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Validation/Validation/Processing/split/split_dim, Validation/ExpandDims)]]

Then I found this code in Tensorflow's split_op.cc:
OP_REQUIRES(context, input_shape.dim_size(split_dim) % num_split == 0, errors::InvalidArgument( "Number of ways to split should evenly divide the split " "dimension, but got split_dim ", split_dim, " (size = ", input_shape.dim_size(split_dim), ") ", "and num_split ", num_split));

I read the code in KittiSeg/submodules/tensorflow-fcn/fcn8_vgg.py; the following code executes the tf.split() function:


         red, green, blue = tf.split(rgb, 3, 3)

        # assert red.get_shape().as_list()[1:] == [224, 224, 1]
        # assert green.get_shape().as_list()[1:] == [224, 224, 1]
        # assert blue.get_shape().as_list()[1:] == [224, 224, 1]
        bgr = tf.concat([
            blue - VGG_MEAN[0],
            green - VGG_MEAN[1],
            red - VGG_MEAN[2],
        ], 3)

This code splits the image along the channel dimension into 3, but the log shows a dimension of size 224 being split into 3, and 224 % 3 != 0.

Memory error

Dear sir,
I am facing a problem that is really difficult for me. Can you help me?
Thanks.

The error is below:
Traceback (most recent call last):
  File "test_fcn32_vgg.py", line 22, in <module>
    vgg_fcn = fcn32_vgg.FCN32VGG()
  File "/home/zty/文档/tensorflow-fcn-master/fcn32_vgg.py", line 33, in __init__
    self.data_dict = np.load(vgg16_npy_path, encoding='latin1').item()
  File "/usr/local/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 419, in load
    pickle_kwargs=pickle_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/numpy/lib/format.py", line 640, in read_array
    array = pickle.load(fp, **pickle_kwargs)
MemoryError

train just for segmentation

Dear Marvin,
thanks for your great code!
I just want to use this code for image segmentation and cannot download a bulky dataset to find out how to use it.
Please provide a step-by-step guide to training and testing this code on a new image dataset for segmentation.
It is required urgently, and I will cite your work.
Please let me know.
[email protected]

Loss error

Hi, after building the network on vgg 8, when I try to run the loss it gives me an error:
(screenshot of the error attached)

If I look at the shape of vgg.pred_up, it is a question mark.
Shapes are shown up to vgg.score_fr.

General training code

Hi, is there any general training code for FCN? I see that this project is based on TensorVision.

Error when running test_fcn32_vgg.py

Hello Marvin,

I have some trouble running your code.

When I run test_fcn32_vgg.py, it reports that there is not enough memory to allocate the tensor (with GPU).
However, test_fcn16_vgg.py works well (with GPU).
I then tried running with CPU only; both test_fcn32_vgg.py and test_fcn16_vgg.py work well.

I read the code, and it seems that test_fcn16_vgg.py should need more memory?

Is it just that my GPU memory is not enough (it is about 2 GB), or something else?

I ran the test code, but the result is not so good:

(fcn32 upsampled output image attached)

You can see that part of the dog is missing, which is not like your result on GitHub.

Thanks for your time.

Sincerely,
Shuhong

training dataset

After reading the readme file, it is still not clear to me which dataset was used for training. I understand some filters are initialized from vgg16, but then training is done on what dataset?

By "No Pascal VOC finetuning was applied to the weights.", I guess the training set is not Pascal VOC?

Thanks,

running error from "python test_fcn32_vgg.py"

Hi, all,

I got the following error when running "python test_fcn32_vgg.py"

InternalError (see above for traceback): Dst tensor is not initialized.
	 [[Node: fc6/weights/Initializer/Const = Const[_class=["loc:@fc6/weights"], dtype=DT_FLOAT, value=Tensor<type: float shape: [7,7,512,4096] values: [[[1.9745843e-05 0.00035308721 -0.0018327669]]]...>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

My version of tf is r0.12.1 and I checked out the commit (bf9400c) already. Could someone suggest how to fix this?

THX!

Problems with gray images

Hi Mr. Marvin Teichmann,
I found this in the usage:
Here images is a tensor with shape [None, h, w, 3], where h and w can have arbitrary size.

My training data is grayscale images with no RGB channels.
Can I still use this code and VGG to train on them?

What does the "use_dilated" mean in "build" function?

I can't understand what the code below is trying to do; can anybody explain?
Thanks so much!

if use_dilated:
    pad = [[0, 0], [0, 0]]
    self.pool4 = tf.nn.max_pool(self.conv4_3, ksize=[1, 2, 2, 1],
                                strides=[1, 1, 1, 1],
                                padding='SAME', name='pool4')
    self.pool4 = tf.space_to_batch(self.pool4,
                                   paddings=pad, block_size=2)
else:
    self.pool4 = self._max_pool(self.conv4_3, 'pool4', debug)
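Not an official explanation, but this looks like the standard space_to_batch "à trous" trick: pool4 is run with stride 1, then 2x2 spatial blocks are moved into the batch dimension so the following layers behave as if dilated, and batch_to_space (the pool5/fc7 lines quoted in the previous issue) reverses it later. A shape-only sketch:

import numpy as np
import tensorflow as tf

x = tf.constant(np.zeros((1, 24, 32, 512), dtype=np.float32))
pad = [[0, 0], [0, 0]]

# Move 2x2 spatial blocks into the batch dimension ...
y = tf.space_to_batch(x, paddings=pad, block_size=2)
print(y.get_shape())  # (4, 12, 16, 512)

# ... and batch_to_space reverses it afterwards.
z = tf.batch_to_space(y, crops=pad, block_size=2)
print(z.get_shape())  # (1, 24, 32, 512)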

Training code

Hi Marvin,

Can you please share the training code too?

Thanks,
Amir

Contribute to tflearn

@MarvinTeichmann thank you for this implementation of FCN in tensorflow. This is exactly what I was looking for. I'd like to humbly suggest you port your work into tflearn. I think that your implementation of _upscore_layer as well as the loss function you've defined would be very useful in tflearn.

Test loss diverges during training nyud-fcn32s-color

Hello,

After successfully training fcn8s-atonce on my own dataset with Caffe, I wanted to familiarize myself with your implementation of FCN in Tensorflow. I decided to start with training nyud-fcn32s-color on the NYUDv2 dataset (40-class challenge) with the heavy learning strategy (batch size: 1, unnormalized loss, lr: 1e-10, momentum: 0.99).

I forked your tensorflow-fcn repo. Here's mine: https://github.com/howard-mahe/tensorflow-fcn
I've made some simple modifications in order to:

  • create NYUDv2DataHandler.py, equivalent of nyud_layers.py in Caffe
  • modify loss.py to use an unnormalized loss
  • implement a training script

Training goes well in the first iterations, but quickly my test loss starts to diverge. The most surprising part is that my test metrics (global accuracy, mean accuracy per class, mean IoU) don't collapse at all, but oscillate a lot.

(plot of the training/test logs attached)

Regarding FCN paper, I would expect the following results:

  • gacc: 61.8%
  • macc: 44.7%
  • mIoU: 31.6%

Can anyone have a look at my repo and let me know if I made something wrong?
This issue has been driving me crazy for a long time now.

Thanks a lot for any feedback.

A problem with fcn8_vgg.py

Line 112, self.pool5 = self._max_pool(self.conv4_3, 'pool4', debug), should be self.pool5 = self._max_pool(self.conv5_3, 'pool5', debug).

_summary_reshape

Can you briefly explain why you need this function and how it works? I got an error from it when I tried to train the model with a function that I wrote myself. I used PASCAL VOC, so I defined num_class = 21, including background. Here is the error stack:

  /home/weiliu/projects/tensorflow-fcn/fcn32_vgg.py(106)build()
    104             self.score_fr = self._fc_layer(self.fc7, "score_fr",
    105                                            num_classes=num_classes,
--> 106                                            relu=False)
    107 
    108         self.pred = tf.argmax(self.score_fr, dimension=3)

  /home/weiliu/projects/tensorflow-fcn/fcn32_vgg.py(165)_fc_layer()
    163                                 message='Shape of %s' % name,
    164                                 summarize=4, first_n=1)
--> 165             return bias
    166 
    167     def _score_layer(self, bottom, name, num_classes):

  /home/weiliu/projects/tensorflow-fcn/fcn32_vgg.py(340)get_fc_weight_reshape()
    338         if num_classes is not None:
    339             weights = self._summary_reshape(weights, shape,
--> 340                                             num_new=num_classes)
    341         init = tf.constant_initializer(value=weights,
    342                                        dtype=tf.float32)

> /home/weiliu/projects/tensorflow-fcn/fcn32_vgg.py(298)_summary_reshape()
    296             avg_idx = start_idx//n_averaged_elements
    297             avg_fweight[:, :, :, avg_idx] = np.mean(
--> 298                 fweight[:, :, :, start_idx:end_idx], axis=3)
    299         return avg_fweight
    300 

And the error message is: IndexError: index 21 is out of bounds for axis 3 with size 21.
Thanks for your help!

Max Tensor Size in FCN-16 and FCN-32

When running FCN-16 or FCN-32 with num_classes > ~700 or > ~350 respectively, Tensorflow complains as follows and crashes:

ValueError: Cannot create a tensor proto whose content is larger than 2GB.

It fails because the deconv filter is > 2 GB with this number of classes (64 x 64 x 350 x 350 in the case of FCN-32) and TF rejects tensors that big. Is there a way to reconfigure the deconv filter/layer so that it doesn't have to allocate all of this at once? My first thought was to use a generator, but tensorflow.py_func() doesn't support generator functions and I haven't found any documentation hinting at the possibility.

Note, I tried splitting up the deconv up into several smaller operations, but it turns out TF Graph has a limit of 2gb as well.

Edit: I was able to work around this by splitting the deconv operation into 20 smaller deconv operations, then concatenating the results and running argmax. This produced significantly worse results than just reducing the number of output classes to 20 and is significantly more computationally expensive.

Network seems not connected in Tensorboard

Hi Marvin,

Thanks for your great code!

I tried training your FCN, but ended up finding some confusing things.

The first thing is that during training the accuracy started from a low value, for example 0.3, and then jumped to over 0.9, but the final trained model (with accuracy around 0.95) does not actually work that well. By the way, I am doing 2-class classification (including background).

Another thing is that in Tensorboard the 'graph' of the network seems not to be connected (as the diagram below shows), and the other tabs are empty, even 'scalars' and 'histograms'. Maybe I did something improperly.

(Tensorboard graph screenshot attached)

The core part of my training script is attached here (I did not touch your script fcn8_vgg.py).

train_imgs, train_labels, test_imgs, test_labels = split_dataset(imgs, labels)

x = tf.placeholder(tf.float32, [None, 64, 48, 3])
y = tf.placeholder(tf.int64, [None, 64, 48, 2])

vgg_fcn = fcn8_vgg.FCN8VGG()
with tf.name_scope("content_vgg"):
	vgg_fcn.build(x, True, 2, True, True)

cross_entropy = loss(vgg_fcn.upscore32, y, 2)
train_step = tf.train.AdamOptimizer(1e-6).minimize(cross_entropy)

correct_prediction = tf.equal(vgg_fcn.pred_up, tf.argmax(y, dimension = 3))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

saver = tf.train.Saver()

with tf.Session() as sess:
	sess.run(tf.global_variables_initializer())

	for i in range(2000):
		training_batch = get_batch(train_imgs, train_labels)
		imgs_train_batch = training_batch['X']
		labels_train_batch = training_batch['Y']

		if i % 100 == 0:
			train_accuracy = accuracy.eval(feed_dict={x: imgs_train_batch, y: labels_train_batch})
			test_accuracy = accuracy.eval(feed_dict={x: test_imgs, y: test_labels})
			print (train_accuracy)
			print('step %d | training accuracy %g | test accuracy %g' % (i, train_accuracy, test_accuracy))
			file_writer = tf.summary.FileWriter(log_dir, sess.graph)

		if i % 1000 == 0:
			save_path = saver.save(sess, bestcheck_dir+'%04d'%(test_accuracy*10000))

		train_step.run(feed_dict={x: imgs_train_batch, y: labels_train_batch})

thanks in advance,

Jay

fcn16: ValueError: Incompatible shapes for broadcasting: (?, ?, ?, 2) and (?, ?, ?, 20)

I run:
python test_fcn16_vgg.py

And get:
Traceback (most recent call last):
  File "test_fcn16_vgg.py", line 36, in <module>
    vgg_fcn.build(batch_images, debug=True)
  File "/home/***/misc/tensorflow-fcn/fcn16_vgg.py", line 125, in build
    self.fuse_pool4 = tf.add(self.upscore2, self.score_pool4)
...
ValueError: Incompatible shapes for broadcasting: (?, ?, ?, 2) and (?, ?, ?, 20)

Examination of _upscore_layer for the three network architectures shows that the function in fcn16_vgg.py includes this line at 237:

deconv.set_shape([None, None, None, 2])

which is hard-coded for 2 classes. I assume this line should be removed; the code runs after I remove it.

Can we store the fine-tuned weight using tf.train.Saver()?

Hi, I am fine-tuning your FCN on my own dataset. After tuning, I was doing things like sess.run on 'conv1_1/filter:0', 'conv1_1/biases:0' to extract the weights and save them into an npy file. But since we have lots of filters here, listing all of them would be messy.

So I was wondering if we can do it with tf.train.Saver, which allows us to store the trained variables quickly and cleanly. But I noticed that there is no "Variable" in the fcn8s_vgg file. Does that mean that, using your original fcn8s_vgg.py file, we can't use tf.train.Saver()?

I am new to tensorflow, please let me know your comments, thanks.
