
segmentation_models's Introduction

Python library with Neural Networks for Image Segmentation based on Keras and TensorFlow.

The main features of this library are:

  • High-level API (just two lines of code to create a segmentation model)
  • 4 model architectures for binary and multi-class image segmentation (including the legendary Unet)
  • 25 available backbones for each architecture
  • All backbones have pre-trained weights for faster and better convergence
  • Helpful segmentation losses (Jaccard, Dice, Focal) and metrics (IoU, F-score)

Important note

Some models of version 1.* are not compatible with previously trained models. If you have such models and want to load them, roll back with:

$ pip install -U segmentation-models==0.2.1


Quick start

The library is built to work with both the Keras and TensorFlow Keras (tf.keras) frameworks.

import segmentation_models as sm
# Segmentation Models: using `keras` framework.

By default it tries to import keras; if keras is not installed, it falls back to the tensorflow.keras framework. There are several ways to choose the framework:

  • Provide environment variable SM_FRAMEWORK=keras / SM_FRAMEWORK=tf.keras before import segmentation_models
  • Change framework sm.set_framework('keras') / sm.set_framework('tf.keras')
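
For example, selecting the tf.keras framework via the environment variable (a minimal sketch; the variable has to be set before the first import of segmentation_models):

import os
os.environ['SM_FRAMEWORK'] = 'tf.keras'  # must be set before segmentation_models is imported

import segmentation_models as sm
# Segmentation Models: using `tf.keras` framework.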

You can also specify which image_data_format to use: segmentation-models works with both channels_last and channels_first. This can be useful for further model conversion to the Nvidia TensorRT format or for optimizing the model for CPU/GPU computation.

import keras
# or from tensorflow import keras

keras.backend.set_image_data_format('channels_last')
# or keras.backend.set_image_data_format('channels_first')

The created segmentation model is just an instance of a Keras Model, which can be built as easily as:

model = sm.Unet()

Depending on the task, you can change the network architecture by choosing backbones with fewer or more parameters, and use pretrained weights to initialize them:

model = sm.Unet('resnet34', encoder_weights='imagenet')

Change number of output classes in the model (choose your case):

# binary segmentation (these parameters are the defaults when you call Unet('resnet34'))
model = sm.Unet('resnet34', classes=1, activation='sigmoid')
# multiclass segmentation with non-overlapping class masks (your classes + background)
model = sm.Unet('resnet34', classes=3, activation='softmax')
# multiclass segmentation with independent overlapping/non-overlapping class masks
model = sm.Unet('resnet34', classes=3, activation='sigmoid')

Change input shape of the model:

# if you set input channels not equal to 3, you have to set encoder_weights=None
# how to handle such case with encoder_weights='imagenet' described in docs
model = sm.Unet('resnet34', input_shape=(None, None, 6), encoder_weights=None)
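
A common workaround for keeping 'imagenet' encoder weights with a non-3-channel input (a sketch of one possible approach, not taken verbatim from the docs) is to map the input to 3 channels with a 1x1 convolution in front of the pretrained model:

import keras
import segmentation_models as sm

N = 6  # example number of input channels
base_model = sm.Unet('resnet34', encoder_weights='imagenet')

inp = keras.layers.Input(shape=(None, None, N))
x = keras.layers.Conv2D(3, (1, 1))(inp)  # project N channels to the 3 channels the encoder expects
out = base_model(x)
model = keras.models.Model(inp, out)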

Simple training pipeline

import segmentation_models as sm

BACKBONE = 'resnet34'
preprocess_input = sm.get_preprocessing(BACKBONE)

# load your data
x_train, y_train, x_val, y_val = load_data(...)

# preprocess input
x_train = preprocess_input(x_train)
x_val = preprocess_input(x_val)

# define model
model = sm.Unet(BACKBONE, encoder_weights='imagenet')
model.compile(
    'Adam',
    loss=sm.losses.bce_jaccard_loss,
    metrics=[sm.metrics.iou_score],
)

# fit model
# if you use data generator use model.fit_generator(...) instead of model.fit(...)
# more about `fit_generator` here: https://keras.io/models/sequential/#fit_generator
model.fit(
   x=x_train,
   y=y_train,
   batch_size=16,
   epochs=100,
   validation_data=(x_val, y_val),
)

The same manipulations can be done with Linknet, PSPNet, and FPN. For more detailed information about the models API and use cases, see Read the Docs.
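
For example, swapping the architecture while keeping the rest of the pipeline unchanged (a minimal sketch):

model = sm.FPN(BACKBONE, encoder_weights='imagenet', classes=1, activation='sigmoid')
# or
model = sm.Linknet(BACKBONE, encoder_weights='imagenet', classes=1, activation='sigmoid')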

Examples

Models training examples:
  • [Jupyter Notebook] Binary segmentation (cars) on CamVid dataset here.
  • [Jupyter Notebook] Multi-class segmentation (cars, pedestrians) on CamVid dataset here.

Models and Backbones

Models

Unet, Linknet, PSPNet, FPN

Backbones

Type Names
VGG 'vgg16' 'vgg19'
ResNet 'resnet18' 'resnet34' 'resnet50' 'resnet101' 'resnet152'
SE-ResNet 'seresnet18' 'seresnet34' 'seresnet50' 'seresnet101' 'seresnet152'
ResNeXt 'resnext50' 'resnext101'
SE-ResNeXt 'seresnext50' 'seresnext101'
SENet154 'senet154'
DenseNet 'densenet121' 'densenet169' 'densenet201'
Inception 'inceptionv3' 'inceptionresnetv2'
MobileNet 'mobilenet' 'mobilenetv2'
EfficientNet 'efficientnetb0' 'efficientnetb1' 'efficientnetb2' 'efficientnetb3' 'efficientnetb4' 'efficientnetb5' 'efficientnetb6' 'efficientnetb7'

Installation

Requirements

  1. python 3
  2. keras >= 2.2.0 or tensorflow >= 1.13
  3. keras-applications >= 1.0.7, <=1.0.8
  4. image-classifiers == 1.0.*
  5. efficientnet == 1.0.*

PyPI stable package

$ pip install -U segmentation-models

PyPI latest package

$ pip install -U --pre segmentation-models

Source latest version

$ pip install git+https://github.com/qubvel/segmentation_models

Documentation

The latest documentation is available on Read the Docs.

Change Log

To see important changes between versions, look at CHANGELOG.md.

Citing

@misc{Yakubovskiy:2019,
  Author = {Pavel Iakubovskii},
  Title = {Segmentation Models},
  Year = {2019},
  Publisher = {GitHub},
  Journal = {GitHub repository},
  Howpublished = {\url{https://github.com/qubvel/segmentation_models}}
} 

License

The project is distributed under the MIT License.

segmentation_models's People

Contributors

678098, btrotta, chawater, gagolucasm, gazay, habi, hasibzunair, ilyaovodov, mathandy, qubvel, sluki, tyler-d, zcoder


segmentation_models's Issues

Multiclass segmentation

Firstly thanks a lot for the great repo!
I am trying to train my data to segment multiple classes, and I have followed the code sample given:

# prepare data
preprocessing_fn = get_preprocessing('resnet34')
x = preprocessing_fn(x)

# prepare model
model = Unet(backbone_name='resnet34', encoder_weights='imagenet')
model.compile('Adam', 'binary_crossentropy', ['binary_accuracy'])

# train model
model.fit(x, y, epochs=100, batch_size=1)
model.save('trained_mask.h5')

My x is a list of arrays of shape (1, 256, 512, 3) (i.e. each is an image of dimension 256x512) and my y is my segmentation mask. Currently I have trained successfully when the mask size is (1, 256, 512, 1); however, I would like to adapt this code for multi-class segmentation. Would that be possible? If yes, how?

I have tried two different options so far:

  • defined the label mask as a one-hot vector so that y now has size (1, 256, 512, 13) (where 13 is the total number of classes considered for labelling).

    • This, however, fails with the following error:
      ValueError: Error when checking target: expected sigmoid to have shape (None, None, 1) but got array with shape (256, 512, 13)
  • so then, given that error, I tried (probably naively) to use (1, 256, 512, 1), setting each pixel of the mask to a number from 0 to 13, where 0 stands for an unlabelled pixel and any other number is the desired class label.

    • Although this trains successfully, the result at test time is still a mask bounded to values between 0 and 1, so it does not allow multi-class segmentation.

Could anyone show me what I am doing wrong and how I could solve the multi-class problem?

Thank you!
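
A possible direction, following the multiclass example from the README above (a sketch assuming 13 classes, one-hot masks y of shape (N, 256, 512, 13), and preprocessed images x):

import segmentation_models as sm

model = sm.Unet('resnet34', classes=13, activation='softmax', encoder_weights='imagenet')
model.compile('Adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
# y must be one-hot masks: y.shape == (num_samples, 256, 512, 13)
model.fit(x, y, epochs=100, batch_size=1)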

Second run getting low "not reasonable" results

Hi, I ran the model with ResNeXt50 and got a normal result; everything was great.
The second time I ran the model with ResNeXt50 (even with exactly the same hyperparameters), I got a very low result: my dice_coef is 0.042 after 20 epochs, whereas in the first run it was 0.3.
It also happened with resnet50 and resnet34.

Does anyone have an idea what's wrong and what I am missing?

Thanks!

Running on a data without test labels

Hello there

Suppose I have trained my model on a dataset that has images as well as masks in both the training and test sets (the Cityscapes dataset). After training on the Cityscapes dataset, is it possible to run the model to create masks for the MSMT17_V2 image dataset? If yes, what changes should I make? Thank you in advance for the reply :-)
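
In general (a sketch, not specific to this repo), a trained Keras segmentation model can be applied to any images of a compatible shape, as long as the same preprocessing is used; here model is assumed to be the trained model and new_images an array of the new dataset's images:

preprocess_input = sm.get_preprocessing('resnet34')        # same backbone preprocessing as during training
pred_masks = model.predict(preprocess_input(new_images))   # probabilities of shape (N, H, W, classes)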

PSPNet final conv

Hi qubvel,

are you sure that the final conv layer of your PSPNet implementation is correct?

model = PSPNet(backbone_name = 'resnet18',
               final_interpolation = 'bilinear',
               encoder_weights = 'imagenet',
               freeze_encoder = False,
               input_shape = (384, 384, 3),
               classes = 1,
               activation = 'sigmoid',
               use_batchnorm = True,
               downsample_factor = 8, 
               psp_pooling_type = 'avg',
               psp_conv_filters = 256,
               dropout = 0.5)

yields for the last couple of layers:


concatenate_1 (Concatenate)            (None, 48, 48, 1152)  0       stage3_unit1_relu1[0][0]
                                                                     resize_image_1[0][0]
                                                                     resize_image_2[0][0]
                                                                     resize_image_3[0][0]
                                                                     resize_image_4[0][0]
conv_block_conv (Conv2D)               (None, 48, 48, 512)   589824  concatenate_1[0][0]
conv_block_bn (BatchNormalization)     (None, 48, 48, 512)   2048    conv_block_conv[0][0]
conv_block_relu (Activation)           (None, 48, 48, 512)   0       conv_block_bn[0][0]
spatial_dropout2d_1 (SpatialDropout2D) (None, 48, 48, 512)   0       conv_block_relu[0][0]
final_conv (Conv2D)                    (None, 48, 48, 1)     4609    spatial_dropout2d_1[0][0]
resize_image_5 (ResizeImage)           (None, 384, 384, 1)   0       final_conv[0][0]
sigmoid (Activation)                   (None, 384, 384, 1)   0       resize_image_5[0][0]


conv_block_conv has 589,824 params; I guess these originate from 512 convolutions of size 1 x 1 x 1152.
And final_conv (Conv2D) has 4,609 params, likely constructed from a single 3 x 3 x 512 conv.
Shouldn't it be the other way around, with conv_block_conv using 3 x 3 convs and the final layer using 1 x 1 convs? Sorry if I am mistaken here.
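
For reference, the reported parameter counts are consistent with that reading (a quick arithmetic check; the presence of a bias term is inferred from the numbers, not from the source):

# conv_block_conv: 512 filters of size 1 x 1 x 1152, no bias (batch norm follows)
print(1 * 1 * 1152 * 512)   # 589824
# final_conv: 1 filter of size 3 x 3 x 512 plus one bias
print(3 * 3 * 512 * 1 + 1)  # 4609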

UResNet Error: ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concat axis.

I am trying to create a Unet with the following code:

img_size_ori = 101
img_size_target = 197
model = UResNet34((img_size_target, img_size_target, 3))

But then I got this error message:

Traceback (most recent call last):
  File ".\main_unet.py", line 24, in <module>
    model = UResNet34((img_size_target, img_size_target, 3))
  File "F:\Workspace\kaggle\Salt Identification\model\segmentation_models\unet\models.py", line 40, in UResNet34
    activation=activation, **kwargs)
  File "F:\Workspace\kaggle\Salt Identification\model\segmentation_models\unet\builder.py", line 38, in build_unet
    x = up_block(filters, i, upsample_rate=up_size, skip=skip, **kwargs)(x)
  File "F:\Workspace\kaggle\Salt Identification\model\segmentation_models\unet\blocks.py", line 27, in layer
    x = Concatenate()([x, skip])
  File "F:\Anaconda\lib\site-packages\keras\engine\base_layer.py", line 431, in __call__
    self.build(unpack_singleton(input_shapes))
  File "F:\Anaconda\lib\site-packages\keras\layers\merge.py", line 354, in build
    'Got inputs shapes: %s' % (input_shape))
ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 14, 14, 512), (None, 13, 13, 256)]

Thanks for this nice repository.
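
A likely cause (an observation, not an official answer): encoders with five downsampling stages need input sides divisible by 2**5 = 32, otherwise the decoder and skip feature maps end up with mismatched shapes, as in the 14 vs. 13 above. A quick check:

def fits_encoder(side, num_downsamples=5):
    """Heuristic: each spatial dimension should be divisible by 2**num_downsamples."""
    return side % (2 ** num_downsamples) == 0

print(fits_encoder(197))  # False -> Concatenate shape mismatch
print(fits_encoder(192))  # True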

Allow for other input depths than 3

Hi, this is a feature request:
I'm interested in using a different number of input channels, say 5. This could be achieved by reshaping with a 1x1 convolution or by adapting the number of filters in the first convolution. In the latter case there would be no pretrained weights available, or one would have to copy the weights from one of the pretrained filters to the added filters for later fine-tuning, which might even be preferable.

Kind regards and a happy new year!

Pretrained weights

Has anyone transferred the weights from the original PSPNet implementation (in Caffe) to the Keras model in this repository?

ImportError: cannot import name 'Resnet18'

I get the listed error below when I try to run your demo code:

from segmentation_models import Unet
# prepare model
model = Unet(backbone_name='resnet34', encoder_weigths='imagenet')
model.compile('Adam', 'binary_crossentropy', ['binary_accuracy'])
TypeError                                 Traceback (most recent call last)
<ipython-input-12-b1cbdbcf0c01> in <module>()
      1 from segmentation_models import Unet
      2 # prepare model
----> 3 model = Unet(backbone_name='resnet34', encoder_weigths='imagenet')
      4 model.compile('Adam', 'binary_crossentropy', ['binary_accuracy'])

TypeError: Unet() got an unexpected keyword argument 'encoder_weigths'

FailedPreconditionError: Attempting to use uninitialized value

I'm trying to use a resnet-50 Unet with encoder weights from imagenet.

But I'm getting the next error:

FailedPreconditionError: Attempting to use uninitialized value decoder_stage0_bn1_5/moving_mean/biased
[[Node: decoder_stage0_bn1_5/moving_mean/biased/read = IdentityT=DT_FLOAT, _class=["loc:@decoder_stage0_bn1_5/AssignMovingAvg"], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

I tried both encoder_weights='imagenet' and encoder_weights=None.

IMAGE_SIZE=224
model = Unet(backbone_name=BACKBONE_NAME, encoder_weights='imagenet',
                 freeze_encoder=True,
                 input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3))
model.compile('Adam', 'binary_crossentropy', ['binary_accuracy'])

    ...

history0 = model.fit_generator(train_gen,
                                  callbacks=[learning_rate, save_checkpoints],
                                  epochs=2,
                                  shuffle=True)

It seems as if the Unet expansive path hadn't been initialized.

How could I solve this?

InvalidArgumentError (see above for traceback): Incompatible shapes: [16,160,160,64] vs. [16,159,159,64]

Thanks for making this great library. I'm trying to insert your Unet model into my own tensorflow training apparatus (generator, optimizer, loss function etc)...also within a Python 2.7 project (!)...and I get the following error:

InvalidArgumentError (see above for traceback): Incompatible shapes: [16,160,160,64] vs. [16,159,159,64]
	 [[Node: optimization_1/gradients/outputs/link-resnet18/add_12/add_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](optimization_1/gradients/outputs/link-resnet18/add_12/add_grad/Shape, optimization_1/gradients/outputs/link-resnet18/zero_padding2d_2/Pad_grad/Shape)]]

My input image size is 256x256. I've tried changing this to 200x200 and get a similar off-by-one error.

Any idea why this might be happening? Is there a specific input image size I must use?

Custom loss function explodes and models won't learn

Thanks for the great repository and work, this is so helpful!

Recently I've trained a TensorFlow-based ResNet101 on the Pascal VOC 2012 dataset. Its particularity is that there are 21 classes plus one extra class used to denote "ambiguous" regions, meant to be ignored when computing the loss and metrics. Pixels marked as ambiguous have the value 255.

For this purpose, I have defined a custom loss function that I'm successfully using in the TF-based version, as follows:

def restricted_categorical_crossentropy(y_true, y_pred, class_labels):
    def get_valid_probabilities_and_labels(annotation_batch_tensor,
                                    logits_batch_tensor,
                                    class_labels):
        valid_labels_mask = tf.not_equal(annotation_batch_tensor,
                                         mask_out_class_label) # in this case mask_out_class_label is equal to 255.
        valid_labels_indices = tf.where(valid_labels_mask)
        valid_batch_indices = tf.to_int32(valid_labels_indices)

        valid_labels_batch_tensor = tf.gather_nd(params=annotation_batch_tensor, indices=valid_batch_indices)
        valid_probabilities_batch_tensor = tf.gather_nd(params=logits_batch_tensor, indices=valid_batch_indices)

        return valid_labels_batch_tensor, valid_probabilities_batch_tensor

    valid_labels, valid_predictions = get_valid_probabilities_and_labels(
        annotation_batch_tensor=y_true,
        logits_batch_tensor=y_pred,
        class_labels=class_labels)

    cross_entropy = K.categorical_crossentropy(valid_labels, valid_predictions)

    # The number of elements is different during each step due to mask out regions: normalize entropy.
    return K.mean(cross_entropy)

When I compile any of this repo's Keras models using Keras' native K.categorical_crossentropy, I get fairly decent values starting at around 2.5.

However, when I compile it using my custom loss function, the values explode to things like 11563889.6988.

How does this make any sense, given that

  1. my custom function uses K.categorical_crossentropy on a subset of pixels of the same batches;
  2. that exact same function works well when I'm not using Keras (I get values close to 2.5 and lower)?

Of course I'm using the exact same preprocessing functions and the batching process is literally copy/pasted so I doubt it's coming from the data.

For both loss functions (K.categorical_crossentropy or restricted_categorical_crossentropy), none of the models I've tried compiling from this repo have been able to learn anything worthwhile, whereas my TF-based implementation could. The accuracy increases to about 84%, but the mean IoU is ridiculously low (around 4%, which for 21 classes is nothing more than random assignment). In comparison, my TF-based model could reach around 55% mIoU when also pretrained on ImageNet.

Do you have any thought on where my mistake(s) could be by any chance @qubvel ?

Thanks in advance for your kind help!

About influence of pretrained weights

Hello qubvel, are classification model weights trained on ImageNet helpful for image segmentation? Have you done any related experiments? If so, could you please show some experimental results? Finally, do you think the weights of a pretrained backbone help with the segmentation of remote sensing images? I am looking forward to your reply.

Data preparation for pretrained models

Great repository!!
I would like to ask whether there is any special image preprocessing used for training on ImageNet. For example: a standard image size (e.g. 224x224), scaling to [-1, 1], or RGB-to-BGR conversion.

Thank you very much.

Adding scSE blocks to the architecture.

Hi,

How can I modify the architecture that you have provided? I'd like to add scSE blocks to the encoder/decoder. I know I can change the input and output using the Keras Model class with Model(inp=x, out=y), but how do I change the layers in between?

About freeze_model

Hello, I have a question about the function 'freeze_model'. I cannot find this function in the utils module and I don't know where it is. Could you please provide the code of 'freeze_model' or its location? Thank you.

segmentation on my data?

Hi,
I have echocardiography data and labelled files for the data (.json and .csv). Can I use these models for segmentation of my data? Is there any document which explains how I can use this on my own data?

NCHW supports

Hi @qubvel, I've built a pipeline from training to integrating the trained model into TensorRT. In my case, UNet + ResNet34 inference performance improved from 54 ms/frame to 7 ms/frame on a P40 GPU. But to reach that point, I had to convert the model from NHWC to NCHW because of an incompatibility between NHWC models and the TensorRT parser.

So I'm wondering if it is possible to add NCHW support here, so that the model can be easily leveraged by inference frameworks such as TensorRT.
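
For what it's worth, the quick-start section above suggests the data format can be switched before building the model (a minimal sketch; whether this fully covers the TensorRT use case is untested here):

import keras
import segmentation_models as sm

keras.backend.set_image_data_format('channels_first')  # NCHW
model = sm.Unet('resnet34', encoder_weights='imagenet')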

ResourceExhaustedError

I feed 30 pictures of size 480x528 to the PSPNet and my GPU is a GTX 1080 Ti. However, it returned the following error:

2018-10-18 00:33:05.626813: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 11 Chunks of size 152064000 totalling 1.56GiB
2018-10-18 00:33:05.626816: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 5 Chunks of size 243302400 totalling 1.13GiB
2018-10-18 00:33:05.626819: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 1 Chunks of size 266727424 totalling 254.37MiB
2018-10-18 00:33:05.626823: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 1 Chunks of size 334904064 totalling 319.39MiB
2018-10-18 00:33:05.626826: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 3 Chunks of size 364953600 totalling 1.02GiB
2018-10-18 00:33:05.626829: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 3 Chunks of size 486604800 totalling 1.36GiB
2018-10-18 00:33:05.626832: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Sum Total of in-use chunks: 9.83GiB
2018-10-18 00:33:05.626837: I tensorflow/core/common_runtime/bfc_allocator.cc:680] Stats: 
Limit:                 10586741146
InUse:                 10556298496
MaxInUse:              10556328192
NumAllocs:                   10164
MaxAllocSize:           3721396224
2018-10-18 00:33:05.626970: W tensorflow/core/common_runtime/bfc_allocator.cc:279] ***************************************************************************************************x
2018-10-18 00:33:05.626989: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at conv_ops.cc:693 : Resource exhausted: OOM when allocating tensor with shape[30,320,60,66] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "/home/public/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-5-4e46772605b9>", line 1, in <module>
    model.fit(X, Y, epochs=2)
  File "/home/public/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1039, in fit
    validation_steps=validation_steps)
  File "/home/public/anaconda3/lib/python3.6/site-packages/keras/engine/training_arrays.py", line 199, in fit_loop
    outs = f(ins_batch)
  File "/home/public/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/home/public/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/public/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1454, in __call__
    self._session._session, self._handle, args, status, None)
  File "/home/public/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[30,320,60,66] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[Node: block35_7_conv/convolution = Conv2D[T=DT_FLOAT, _class=["loc:@train...kpropInput"], data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](block35_7_mixed/concat, block35_7_conv/kernel/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
	 [[Node: loss_1/mul/_8545 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_22137_loss_1/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

How can I solve this problem? Thank you!

Data loading

Hi,

Thanks for developing such clean and modular code!
I'm looking at training on my dataset and I am not sure how load_data(...) is supposed to be implemented. Do you have any example?
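
load_data is just a placeholder in the README; a minimal, hypothetical implementation (assuming images and masks are stored as same-named PNG files and OpenCV is available) might look like:

import glob
import numpy as np
import cv2  # any image reader works; cv2 is just an example

def load_data(image_dir, mask_dir):
    """Hypothetical helper: load RGB images and binary masks into float32 arrays."""
    image_paths = sorted(glob.glob(image_dir + '/*.png'))
    mask_paths = sorted(glob.glob(mask_dir + '/*.png'))
    x = np.stack([cv2.imread(p)[..., ::-1] for p in image_paths]).astype('float32')   # BGR -> RGB
    y = np.stack([cv2.imread(p, cv2.IMREAD_GRAYSCALE)[..., None] for p in mask_paths])
    y = (y > 0).astype('float32')                                                      # binary masks in {0, 1}
    return x, y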

How to handle this warning???

Could you please look at this warning and suggest what could be done in order to use the Keras version of the pre-trained ResNeXt models, since your models and the Keras models are not at all compatible.

""Current ResNext models are deprecated, use keras.applications ResNeXt models""

Thanking You

Saving segmented images

Hello,

How can I save the segmentation results for validation images, or evaluate the model on a given dataset?
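
A general sketch (assuming a trained model, preprocessed validation images x_val with masks y_val, and a binary task):

import numpy as np
import cv2  # any image writer works

pred = model.predict(x_val)                    # (N, H, W, 1) probabilities in the binary case
masks = (pred[..., 0] > 0.5).astype(np.uint8)  # threshold to {0, 1}
for i, mask in enumerate(masks):
    cv2.imwrite('pred_{}.png'.format(i), mask * 255)

# evaluation on a labelled dataset
print(model.evaluate(x_val, y_val, batch_size=16))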

FPN not working

Hi,

With the exact same inputs and backbone, this code works for Unet but not for FPN:

model = Unet(backbone_name=backbone, classes=1, encoder_weights='imagenet')
#model = FPN(backbone_name=backbone, classes=1, encoder_weights='imagenet')
model.compile('Adam', 'binary_crossentropy', ['binary_accuracy'])
model.fit(x, y, epochs=10, batch_size=1)

For FPN :
Epoch 1/10
100/100 [==============================] - 21s 213ms/step - loss: 15.6301 - binary_accuracy: 0.0196
Epoch 2/10
100/100 [==============================] - 14s 143ms/step - loss: 15.6301 - binary_accuracy: 0.0196
Epoch 3/10
100/100 [==============================] - 14s 143ms/step - loss: 15.6301 - binary_accuracy: 0.0196
Epoch 4/10
100/100 [==============================] - 14s 143ms/step - loss: 15.6301 - binary_accuracy: 0.0196
Epoch 5/10
100/100 [==============================] - 14s 143ms/step - loss: 15.6301 - binary_accuracy: 0.0196
Epoch 6/10
100/100 [==============================] - 14s 144ms/step - loss: 15.6301 - binary_accuracy: 0.0196
Epoch 7/10
100/100 [==============================] - 14s 144ms/step - loss: 15.6301 - binary_accuracy: 0.0196
Epoch 8/10
100/100 [==============================] - 14s 144ms/step - loss: 15.6301 - binary_accuracy: 0.0196
Epoch 9/10
100/100 [==============================] - 14s 144ms/step - loss: 15.6301 - binary_accuracy: 0.0196
Epoch 10/10
100/100 [==============================] - 14s 144ms/step - loss: 15.6301 - binary_accuracy: 0.0196

For Unet:
Epoch 1/10
100/100 [==============================] - 21s 214ms/step - loss: 0.1810 - binary_accuracy: 0.9673
Epoch 2/10
100/100 [==============================] - 14s 135ms/step - loss: 0.0266 - binary_accuracy: 0.9955
Epoch 3/10
100/100 [==============================] - 14s 135ms/step - loss: 0.0172 - binary_accuracy: 0.9957
Epoch 4/10
100/100 [==============================] - 14s 136ms/step - loss: 0.0141 - binary_accuracy: 0.9958
Epoch 5/10
100/100 [==============================] - 14s 136ms/step - loss: 0.0136 - binary_accuracy: 0.9956
Epoch 6/10
100/100 [==============================] - 14s 136ms/step - loss: 0.0165 - binary_accuracy: 0.9942
Epoch 7/10
100/100 [==============================] - 14s 136ms/step - loss: 0.0123 - binary_accuracy: 0.9959
Epoch 8/10
100/100 [==============================] - 14s 136ms/step - loss: 0.0090 - binary_accuracy: 0.9970
Epoch 9/10
100/100 [==============================] - 14s 136ms/step - loss: 0.0081 - binary_accuracy: 0.9973
Epoch 10/10
100/100 [==============================] - 14s 136ms/step - loss: 0.0075 - binary_accuracy: 0.9975

Is there anything broken?

Thanks for this repo!

About layer regularization.

I'm curious why there is no regularizer option for the layers. Is training without regularization a deliberate trick?

How can I use the mode pspnet?

I executed the code:

from segmentation_models import pspnet
model = pspnet(backbone_name='inceptionresnetv2', freeze_encoder=True, input_shape=(None, None, 3))

But the console returned the error as follow:

Traceback (most recent call last):
  File "/home/[UserName]/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-4764359b8965>", line 1, in <module>
    model = pspnet(freeze_encoder=True, input_shape=(None, None, 3))
TypeError: 'module' object is not callable

What could the problem be?

Python 3.5 support?

Thanks for this great repository.

Why is it rated for Python 3.6 and higher only? The official TensorFlow Docker containers use Python 3.5. What do I need to change to make it compatible?

Improve readme

Hello,
I consider myself an ML beginner, but I've been exposed to many ideas. I think the readme would be much better if there were a paragraph explaining some basic things about this package.

  1. We can distribute pretrained model weights for segmentation models, the same as for object detection tasks with ImageNet weights (link to a transfer learning blog post).
  2. What considerations might go into reusing weights? Would weights trained on astronomical data work for cell segmentation data? How about for radio signals in noise?
  3. Explain the term "backbone model": how can we use ImageNet weights as the backbone for a segmentation task, and why does this work?
  4. Explain the term "preprocessing": the keras.io docs show nothing for bcg or ka*.

To a lesser extent, give the Available Models section more space in the readme. At first glance it seemed like the backbones are the primary concern when using this package, but the primary concern should probably be the top-level architecture to try out (not the backbones).

If you approve I could draft up such a paragraph if you answer these questions.

Freeze Encoder skipping an iteration on GPU

When I use Unet with ResNet152 with ImageNet weights and the encoder frozen, training on the CPU works as intended, but on the GPU the training runs only every other iteration, as in 1, 3, 5, 7... it skips the even iterations.
Even though the training looks to be going well, giving close to 98% accuracy with binary cross-entropy, the predictions are all in the 3e-8 range; basically the predicted images are all 0.
This issue doesn't happen if freeze_encoder is not used.

cuDNN launch failure for Resnet Encoders

I have the following error message when running segmentation model with resnet as encoder:
InternalError (see above for traceback): cuDNN launch failure : input shape ([5,3,576,576])
[[node bn_data/FusedBatchNorm (defined at /home/lab/anaconda3/envs/tf-aml/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:1806) = FusedBatchNorm[T=DT_FLOAT, _class=["loc:@training/Adam/gradients/AddN_249"], data_format="NCHW", epsilon=2e-05, is_training=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](training/Adam/gradients/bn_data/FusedBatchNorm_grad/FusedBatchNormGrad-1-TransposeNHWCToNCHW-LayoutOptimizer, bn_data/Const_3, bn_data/beta/read, decoder_stage4_bn2/Const_4, decoder_stage4_bn2/Const_4)]]
[[{{node loss/mul/_3415}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_22986_loss/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

It seems this is related to cuDNN and fused batch norm. I've searched the internet for related topics and updated my drivers, but still cannot resolve this. My system information is shown below:
GPU: RTX 2080Ti
Driver: 410.57
CUDA: 9.0
cuDNN: 7.4.2
Keras: 2.1.6 (I downgraded to this version because the newer version seems to cause problems for batch norm)
Tensorflow: 1.12

IoU values coming out as negative or > 1

How can this happen? The loss is also negative, and the model always converges to predicting all class 0.

genargs = dict(
    brightness_range=[0.5, 2],
    rotation_range=180,
    zoom_range=[0.3, 2],
    horizontal_flip=True,
    vertical_flip=True
)

xgen = ImageDataGenerator(**genargs)
ygen = ImageDataGenerator(**genargs)

# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
seed = 1
xgen.fit(x_train, augment=True, seed=seed)
ygen.fit(y_train, augment=True, seed=seed)

epochs = 1000
saves = 10

# fits the model on batches with real-time data augmentation:

### seg model
BACKBONE = 'inceptionv3'
x_train = sm.backbones.get_preprocessing(BACKBONE)(x_train)
# need to get them to 528 x 672
x_train = np.pad(x_train, ((0,0), (0,16), (0,32), (0,0)), 'mean')
x_train = np.concatenate((x_train, x_train, x_train), axis=3)

# add the padding as background class
y_train = np.stack([np.pad(y_train[:,:,:,0], ((0,0), (0,16), (0,32)), 'constant', constant_values=1),
                    np.pad(y_train[:,:,:,1], ((0,0), (0,16), (0,32)), 'constant', constant_values=0),
                    np.pad(y_train[:,:,:,2], ((0,0), (0,16), (0,32)), 'constant', constant_values=0)], axis=3)

model = sm.PSPNet(BACKBONE, input_shape=(528, 672, 3), classes=3, encoder_weights=None, psp_use_batchnorm=False)
model.compile(keras.optimizers.Adam(),  # lr=1e-4, epsilon=1e-8, decay=1e-6
              loss=sm.losses.jaccard_loss, metrics=[sm.metrics.f_score])

batch_size = 4
initial_epoch = 0

if train:

    x_train = y_train

    xgen = xgen.flow(x_train, batch_size=batch_size, seed=seed)
    # class mygen:
    #     def __next__(self):
    #         yield sm.backbones.get_preprocessing(BACKBONE)(xgen.next())
    ygen = ygen.flow(y_train, batch_size=batch_size, seed=seed)
    train_generator = zip(xgen, ygen)

    subset = round(epochs / saves)
    for current_epoch in range(subset):
        model.fit_generator(train_generator,  # class_weight={0: 1, 1: 5, 2: 1},
                            steps_per_epoch=len(x_train) / batch_size, epochs=saves,
                            verbose=1, initial_epoch=initial_epoch)

input shape

Hi
Can you provide us with a simple example of Unet (or FPN, ...)?
I couldn't figure out how to build the model and specify my input data.
How should I shape my data? I tried x32 for H and W but it's not working.
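
A minimal sketch of the shapes involved (assuming a binary task with the default classes=1, H and W divisible by 32, and x, y already loaded):

import segmentation_models as sm

model = sm.Unet('resnet34', input_shape=(256, 256, 3), encoder_weights='imagenet')
model.compile('Adam', loss=sm.losses.bce_jaccard_loss, metrics=[sm.metrics.iou_score])
# x: (num_samples, 256, 256, 3) float images, y: (num_samples, 256, 256, 1) binary masks
model.fit(x, y, batch_size=8, epochs=10)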

Retraining the model

When I attempt to train the model, it throws up the error below:

ValueError: Error when checking target: expected sigmoid to have 4 dimensions, but got array with shape (10628, 1)

What format do the labels need to be in? Currently I have a list of integers ranging from 0 to 9, which represent a specific class. Am I right to think that instead the labels need to be images with bounding boxes that show where the item is in the image?
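
For segmentation, the targets are per-pixel masks rather than one label per image (illustrative shapes only, not repo-specific):

import numpy as np

num_images, H, W, num_classes = 16, 256, 256, 10  # example values
y = np.zeros((num_images, H, W, num_classes), dtype='float32')
# y[i, r, c, k] = 1.0 where pixel (r, c) of image i belongs to class k (one-hot masks),
# to be paired with a model built as Unet(..., classes=num_classes, activation='softmax')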

Preprocessing

For the preprocessing for ResNet models (switching RGB images to BGR):

  • Just to confirm, the pre-trained resnet models are trained on BGR?
  • Do the pretrained resnet models expect images to be zero mean & unit variance also, or not?

Another question: when fine-tuning, how important is the initial 2-epoch freeze? Important or not important?

Thank you

Error in mobilenetv2 preprocess_input

First of all, thank you for sharing this nice code. I found it very useful for one of my projects.

I found a problem in get_preprocessing for 'mobilenet' and 'mobilenetv2'.
Simple code:

from segmentation_models.backbones import get_preprocessing
import numpy as np
img = np.zeros((32, 32, 3))
preprocess_input = get_preprocessing('mobilenetv2') # or 'mobilenet' gives the same error
img = preprocess_input(img.astype(np.float32))

Error message:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/path/to/segmentation_models/backbones/mobilenetv2.py", line 92, in preprocess_input
    return imagenet_utils.preprocess_input(x, mode='tf', **kwargs)
  File "/path/to/lib/python3.5/site-packages/keras_applications/imagenet_utils.py", line 186, in preprocess_input
    data_format = backend.image_data_format()
AttributeError: 'NoneType' object has no attribute 'image_data_format'

Note that no such error occurred when I tried the same code with other backbone models like 'resnet18' or 'vgg19', etc. There were two ways to get rid of this error:
Method 1: import Classifiers from classification_models

from classification_models import Classifiers
import numpy as np
img = np.zeros((32, 32, 3))
preprocess_input = Classifiers.get_preprocessing('mobilenetv2') # or 'mobilenet'
img = preprocess_input(img.astype(np.float32))

Method 2: specify backend keyword argument

import keras
from segmentation_models.backbones import get_preprocessing
import numpy as np
img = np.zeros((32, 32, 3))
preprocess_input = get_preprocessing('mobilenetv2') # or 'mobilenet'
img = preprocess_input(img.astype(np.float32), backend=keras.backend)

So the original preprocessing function for mobilenet(v2) works fine, but the updated one in lines 15-16 of segmentation_models/backbones/__init__.py is the problematic part.

Although not a big problem, it is still annoying, as you have to treat the mobilenet(v2) case separately when writing code for general backbones. I'd be more than happy if you would consider fixing this issue. Thanks!

ValueError: input shape

Hello,

Here is the code I have:
model = segmentation_models.Unet(backbone_name='vgg19', encoder_weights='imagenet', input_shape=images[0].shape)
model.compile('Adam', 'binary_crossentropy', ['binary_accuracy'])
model.fit(images, seg_maps, epochs=2)

And I get the following error:

ValueError: A Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 36, 24, 512), (None, 37, 25, 512)]

Shapes are:

  • images: (600, 400, 3)
  • seg_maps: (600, 400, 1)

Thank you for your help on this really nice project.

Why are the first and last layers of ResNet not fed to the FPN?

I've noticed that here:
https://github.com/qubvel/segmentation_models/blob/master/segmentation_models/fpn/model.py#L10
you are extracting outputs from the very beginning of each ResNet stage to feed the decoder. This results in ignoring the whole last ResNet stage (only BN and activation are taken from it).
Is there a reason for this?
Also, data from the high-resolution layer of ResNet (before the first MaxPool) is not used, which results in the need to upsample the FPN results by 4.
Is it like this in the original paper?

IoU metric

In the docs of iou_score you state only that predictions have the shape (B, H, W, C), but you say nothing about the values in the predictions. Are those probabilities (say [0.2, 0.1, 0., 0.7] for one pixel in the 4-class case) or one-hot encoded vectors ([0, 0, 0, 1] for the same pixel and the same number of classes)?

Because the result is not the same.
Let's say that the ground truth in this example is: [0, 1, 0, 0].
In the first case, the intersection is 0.2, but in the second case, it is 0. And 0 is the correct intersection value.

Actually, I'm not sure how to interpret it if those are probabilities ...

Loading the trained model

I trained an FPN model with a vgg16 backbone for multi-class segmentation (classes=3). While loading the trained model I get the error: "ValueError: Unknown layer: ResizeImage".
Thank you

minor typo in documentation

Hi, thank you for this great tool. It's proving very useful in my experiments. Just wanted to alert you to what I think are some small typos in the README.rst documentation.

Under the table in the Backbones section, the second string for type ResNeXt is 'resnet101' when it should be 'resnext101' and the second string for type SE-ResNeXt is 'seresnet101' when it should be 'seresnext101'. I think.

Very small thing! Just wanted to point it out. Again, thank you for building this.

problem with input

Hello! I have some problems using the Unet template.
I try to use Unet with
x, y = X_train, y_train
(X_train contains an np.array of images, as does y_train, and their shapes are (10, 512, 512, 3) - 10 is the number of samples, 512x512 is the resolution and 3 is RGB).
But I get this error: Error when checking target: expected sigmoid to have shape (None, None, 1) but got array with shape (512, 512, 3)
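
A likely mismatch (an observation): the default Unet() uses classes=1 with a sigmoid head, so the target must have a single channel; for 3-channel (one-hot) masks the model should be built with classes=3, e.g.:

model = sm.Unet('resnet34', classes=3, activation='softmax', encoder_weights='imagenet')
# y_train must then be one-hot class masks of shape (10, 512, 512, 3), not the RGB images themselves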
