segmentation_models.pytorch's Introduction

qubvel

linux docker python keras tensorflow pytorch

segmentation_models.pytorch's People

Contributors

aarsh2001, abd-elr4hman, alaydshah, azkalot1, calebrob6, cmamba, daiwt, dependabot[bot], gracikk-ds, ilyadobrynin, julienmaille, kaczmarj, kevinpl07, khornlund, kupchanski, kyle1993, laol777, lizmisha, loopdigga96, ludics, michaelmonashev, nitzanmadar, nmerty, qubvel, remram44, siarheifedartsou, thisisiron, vozf, wamawama, zurk

segmentation_models.pytorch's Issues

Bug with `import segmentation_models_pytorch as smp`

 import segmentation_models_pytorch as smp                                                                                                                                                                  
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-b9e13fa886e0> in <module>
----> 1 import segmentation_models_pytorch as smp

~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/__init__.py in <module>
----> 1 from .unet import Unet
      2 from .linknet import Linknet
      3 from .fpn import FPN
      4 from .pspnet import PSPNet
      5 

~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/unet/__init__.py in <module>
----> 1 from .model import Unet

~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/unet/model.py in <module>
      1 from .decoder import UnetDecoder
      2 from ..base import EncoderDecoder
----> 3 from ..encoders import get_encoder
      4 
      5 

~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/encoders/__init__.py in <module>
      3 from .resnet import resnet_encoders
      4 from .dpn import dpn_encoders
----> 5 from .vgg import vgg_encoders
      6 from .senet import senet_encoders
      7 from .densenet import densenet_encoders

~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/encoders/vgg.py in <module>
      2 from torchvision.models.vgg import VGG
      3 from torchvision.models.vgg import make_layers
----> 4 from torchvision.models.vgg import cfg
      5 from pretrainedmodels.models.torchvision_models import pretrained_settings
      6 

ImportError: cannot import name 'cfg'

multiclass

If I have four classes,

class DiceLoss(nn.Module):
    __name__ = 'dice_loss'

    def __init__(self, eps=1e-7, activation='softmax2d'):
        super().__init__()
        self.activation = activation
        self.eps = eps

    def forward(self, y_pr, y_gt):
        return 1 - F.f_score(y_pr, y_gt, beta=1., eps=self.eps, threshold=None, activation=self.activation)

Should I set activation='softmax2d' myself?

Any difference between sigmoid and softmax activation when using f_score and IOU metrics?

I am using a UNet with a res50 encoder on a medical dataset with 7 classes. I noticed that my IOU score and f_score are pretty low during the whole training stage. Maybe it's because there is class imbalance in my dataset, and sometimes the background (which does not belong to any class) dominates an input image. Thanks for your work. I have 2 questions here. First, should I consider pixels with no labels as a class? In pixel_CE loss the target array should have 1 channel, and all BG pixels should be labeled with 0 (a class number). Second, what loss function should I use, considering my dataset is unique and the IOU and f_score stay low? And how does the choice of activation function influence the calculation of iou and f_score?
Thanks
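
On the sigmoid versus softmax question, a small sketch may help. It uses made-up logits for a single pixel of a hypothetical 4-class problem and plain Python rather than the library's own activation code, so treat it as an illustration only. Sigmoid scores each class independently, so several classes can pass a 0.5 threshold at once. Softmax makes the classes compete, so the scores sum to 1 and argmax picks exactly one class.

```python
import math

def sigmoid(logits):
    """Apply the logistic function to each class logit independently."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def softmax(logits):
    """Normalise class logits so the scores sum to 1 (max-shifted for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one pixel of a 4-class problem.
pixel_logits = [2.0, 1.0, -1.0, 0.5]

sig = sigmoid(pixel_logits)
soft = softmax(pixel_logits)

# Thresholding sigmoid scores at 0.5 can switch on several classes at once.
sigmoid_classes = [i for i, p in enumerate(sig) if p > 0.5]
# Softmax scores compete, so argmax picks exactly one class.
softmax_class = max(range(len(soft)), key=soft.__getitem__)
```

Because the two activations produce different per-class scores for the same logits, any thresholded IoU or F-score computed from them can differ as well.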

None activation for UNet output.

It looks like either sigmoid or softmax is always applied to the output layer of the UNet architecture.

It would be nice to have an option not to apply anything.

The workaround is:

def activation(x): return x

model = smp.Unet('resnet34', encoder_weights='imagenet', activation=activation)

But it would be nice to do it without this hack.

cannot import name 'cfg' from 'torchvision.models.vgg'

Hi, there is an error.

~/anaconda3/envs/dl/lib/python3.7/site-packages/segmentation_models_pytorch/encoders/vgg.py in <module>
      2 from torchvision.models.vgg import VGG
      3 from torchvision.models.vgg import make_layers
----> 4 from torchvision.models.vgg import cfg
ImportError: cannot import name 'cfg' from 'torchvision.models.vgg' (/Users/anaconda3/envs/dl/lib/python3.7/site-packages/torchvision/models/vgg.py)

My PyTorch version is 1.2.0 and my torchvision version is 0.4.0.

IOU metric sometimes bigger than 1

I have slightly modified the script from the car segmentation ipynb file for my own binary segmentation masks:

If I use the same training loop for resnet 18, I get a very high IOU, even higher than 1, which is impossible.

Using evaluation gives:

test_epoch = ValidEpoch(
    model=best_model,
    loss=loss,
    metrics=metrics,
    device=DEVICE,
)
 40/40 [00:06<00:00,  7.23it/s, bce_dice_loss - -3.208e+03, iou - 1.163, f-score - 0.9803] 

But when I score manually on the validation set, the result is much, much lower:

scores = []  # per-image IoU values
for i in range(40):
    n = i  # np.random.choice(len(valid_dataset))
    image_vis = imread(valid_dataset.images_fps[n])
    image, gt_mask = valid_dataset[n]
    image, gt_mask = image, gt_mask.transpose(1,2,0)
    gt_mask = gt_mask.squeeze()

    x_tensor = torch.from_numpy(image).to(DEVICE).unsqueeze(0)
    pr_mask = best_model.predict(x_tensor)
    pr_mask = (pr_mask.squeeze().cpu().numpy() > 0.5).astype(np.uint8)
    score = iou(torch.from_numpy(gt_mask).float().to(DEVICE), torch.from_numpy(pr_mask).float().to(DEVICE)).data.cpu().numpy().max()
    # print(score)
    scores.append(score)

IOU 0.07

Then I tried to change the preprocessing to rescaling by dividing by 255. The model barely learns (IOU near 0.0007), with almost the same result if I measure it manually.

Using Keras I achieved an IOU of almost 0.5 with almost the same pipeline.
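
One possible explanation, offered as an assumption since the issue does not say how the masks are stored: a soft IoU of the form intersection / (sum(gt) + sum(pr) - intersection) is only bounded by 1 when both inputs lie in [0, 1]. If the ground-truth mask holds 0/255 values, the intersection is inflated and the score can exceed 1. A strongly negative bce_dice_loss would be consistent with targets outside [0, 1] as well. The sketch below uses plain Python lists and invented values, not the library's own metric code.

```python
def soft_iou(pr, gt, eps=1e-7):
    """Soft IoU over flat lists: intersection / (sum(gt) + sum(pr) - intersection)."""
    intersection = sum(p * g for p, g in zip(pr, gt))
    union = sum(gt) + sum(pr) - intersection + eps
    return (intersection + eps) / union

pr = [0.9, 0.8, 0.1, 0.0]      # predicted probabilities (made up)
gt_binary = [1, 1, 0, 0]       # mask stored as 0/1
gt_uint8 = [255, 255, 0, 0]    # the same mask stored as 0/255

iou_ok = soft_iou(pr, gt_binary)   # stays within [0, 1]
iou_bad = soft_iou(pr, gt_uint8)   # exceeds 1 because gt is not in [0, 1]
```

Checking `gt_mask.max()` on a few samples from the training and validation loaders would show whether this applies here.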

Which dataset do you use in example.ipynb?

Hi, friend. I'm new to semantic segmentation, so I have to understand and test every step in your example code, cars segmentation (camvid).ipynb.
Can you tell me the dataset's name, so I can download it and run this example conveniently?
Thanks for your help.

Feature request: preprocess_input as a dictionary

Now we have a way to get a preprocess_input function:

In [2]: preprocess_input = get_preprocessing_fn('resnet18', pretrained='imagenet')                                                                                                                                 

In [3]: preprocess_input                                                                                                                                                                                           
Out[3]: functools.partial(<function preprocess_input at 0x7efc1b644400>, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], input_space='RGB', input_range=[0, 1])

I would like to be able, for a given encoder, to get a dictionary with the parameters:

mean, std, input_space, input_range
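
Until such an option exists, one workaround relies on the fact that the returned object is a functools.partial, as the Out[3] output shows. A partial keeps its bound keyword arguments in the `.keywords` attribute. In the sketch below, `preprocess_input` is a hypothetical stand-in for the library's function and the values are copied from the output above.

```python
from functools import partial

def preprocess_input(x, mean=None, std=None, input_space="RGB", input_range=None):
    """Hypothetical stand-in for the library's preprocessing function."""
    return x

# Built the same way the Out[3] repr in the issue suggests.
preprocess_fn = partial(
    preprocess_input,
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225],
    input_space="RGB",
    input_range=[0, 1],
)

# functools.partial stores its bound keyword arguments in `.keywords`.
params = dict(preprocess_fn.keywords)
```

The same `dict(preprocess_input.keywords)` call should work on the object returned by get_preprocessing_fn, as long as it remains a partial.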

[Ask for advice] Modify models by adding skip connections

Hi,

I need to modify models by adding skip connections between encoder layers and decoder layers, like this:

x = input  
x = self.layer0[-1](x)  
x = x + input  
x1 = self.layer1(x)  
x1 = x1 + x  
x2 = self.layer2(x1)  
x2 = x2 + x1  
x3 = self.layer3(x2)  
x3 = x3 + x2  
x4 = self.layer4(x3)  
x4 = x4 + x3  

I've tried implementing this and ran into the problem that the dimensions before and after each layer are unequal, so the values can't be added.

Is it possible to implement this? And if so, how?

Thank you.

confused by BCEDiceLoss

You are using dice + bce. But your dice is calculated as 1 - F.f_score. Should it be 1 - F.dice_coef?
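
For what it's worth, the F-score with beta = 1 and the Dice coefficient are the same quantity, so 1 - f_score(beta=1) is a Dice loss. The sketch below checks this on hypothetical pixel counts. It is written in plain Python from the textbook definitions and is not the library's code.

```python
def f_score(tp, fp, fn, beta=1.0, eps=1e-7):
    """F-beta score from true-positive, false-positive and false-negative counts."""
    b2 = beta ** 2
    return ((1 + b2) * tp + eps) / ((1 + b2) * tp + b2 * fn + fp + eps)

def dice_coef(tp, fp, fn, eps=1e-7):
    """Dice coefficient 2|A & B| / (|A| + |B|), with |A| = tp + fp and |B| = tp + fn."""
    return (2 * tp + eps) / ((tp + fp) + (tp + fn) + eps)

# Hypothetical pixel counts.
tp, fp, fn = 80, 10, 30

f1 = f_score(tp, fp, fn, beta=1.0)
dice = dice_coef(tp, fp, fn)
```

With beta = 1 the F-score denominator 2*tp + fn + fp is exactly (tp + fp) + (tp + fn), which is why the two values agree.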

confusion with the train and valid operations

Hello, thanks for sharing the code. It makes segmentation tasks rather convenient to deal with.
When I use it to train my model, I am confused by the train and valid operations.
Consider the following:

train_logs = train_epoch.run(train_loader)
valid_logs = valid_epoch.run(valid_loader)

train_epoch and valid_epoch are created separately, from smp.utils.train.TrainEpoch and smp.utils.train.ValidEpoch respectively, yet the valid_epoch instance can use the weights obtained by the train_epoch instance for validation. Maybe I am missing a key point.

So how do they share the model weights with each other?

Warm regards.
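
A likely answer, assuming both epoch objects are constructed with the same model instance (as in the example notebook): Python passes objects by reference, so each epoch stores a reference to the one model rather than a copy. Training updates that model in place and validation reads the same object. The classes below are simplified, hypothetical stand-ins and not the library's implementation.

```python
class Model:
    """Hypothetical stand-in for a network; `weight` plays the role of its parameters."""
    def __init__(self):
        self.weight = 0.0

class TrainEpoch:
    def __init__(self, model):
        self.model = model          # stores a reference, not a copy
    def run(self):
        self.model.weight += 1.0    # pretend optimizer step, updates in place

class ValidEpoch:
    def __init__(self, model):
        self.model = model          # same object as in TrainEpoch
    def run(self):
        return self.model.weight    # reads whatever training last wrote

model = Model()
train_epoch = TrainEpoch(model)
valid_epoch = ValidEpoch(model)

train_epoch.run()
seen_by_valid = valid_epoch.run()
```

Here `train_epoch.model is valid_epoch.model` holds, so no explicit weight transfer is needed between the two.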

pr_mask is not correct although the accuracy on CamVid test set is very high

I used the se_resnext50_32x4d encoder and the Unet decoder to train the segmentation model by following the CamVid tutorial on the webpage.

However, the pr_mask is not correct, although the accuracy on the CamVid test set is very high (iou - 0.7417).

valid: 100%|████████████████████████████████████████████| 233/233 [00:10<00:00, 21.86it/s, bce_dice_loss - 0.3587, iou - 0.7417, f-score - 0.8195]

image

The trained model and codes can be found here

https://drive.google.com/drive/folders/0B6X3_r_lRbVUODRiM2RhMWMtNDk4NC00NmM1LWEyODEtMTdjNDI5MWJiMmQ4

how to transfer a standard pretrained resnet model to 4-channel in your code?

In remote sensing, images usually have more than three channels. For example, an image may have NIR, R, G and B. I want to leverage the pretrained weights of a standard ResNet50 and transfer them to a 4-channel input version by copying the RGB weights and setting the NIR weights equal to those of the red channel. I have solved this in my own code, but I have no idea how to do it in yours. How can I solve it in your code? Thank you!
Here is the solution in my code:
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(pretrained=True)
weight = model.conv1.weight.clone()  # pretrained RGB filters
model.conv1 = nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3, bias=False)
with torch.no_grad():
    model.conv1.weight[:, :3] = weight  # copy the RGB weights
    model.conv1.weight[:, 3] = model.conv1.weight[:, 0]  # NIR reuses the red weights

x = torch.randn(10, 4, 224, 224)
output = model(x)

Image size issues

I am currently using a UNet from this package and get some weird errors related to image size that I don't fully understand.

If I use an image size of 320x480 everything works fine, but when I switch to e.g. 350x525 I get the following error. Certain image sizes seem to work, and certain ones don't. For example, 640x960 works, but 160x240 does not. Any ideas?

~/anaconda3/envs/xxx/lib/python3.7/site-packages/segmentation_models_pytorch/base/encoder_decoder.py in forward(self, x)
     23         """Sequentially pass `x` trough model`s `encoder` and `decoder` (return logits!)"""
     24         x = self.encoder(x)
---> 25         x = self.decoder(x)
     26         return x
     27 

~/anaconda3/envs/xxx/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

~/anaconda3/envs/xxx/lib/python3.7/site-packages/segmentation_models_pytorch/unet/decoder.py in forward(self, x)
     93             encoder_head = self.center(encoder_head)
     94 
---> 95         x = self.layer1([encoder_head, skips[0]])
     96         x = self.layer2([x, skips[1]])
     97         x = self.layer3([x, skips[2]])

~/anaconda3/envs/xxx/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

~/anaconda3/envs/xxx/lib/python3.7/site-packages/segmentation_models_pytorch/unet/decoder.py in forward(self, x)
     26         x = F.interpolate(x, scale_factor=2, mode='nearest')
     27         if skip is not None:
---> 28             x = torch.cat([x, skip], dim=1)
     29             x = self.attention1(x)
     30 

~/anaconda3/envs/xxx/lib/python3.7/site-packages/apex/amp/wrap.py in wrapper(seq, *args, **kwargs)
     83             cast_seq = utils.casted_args(maybe_float,
     84                                          seq, {})
---> 85             return orig_fn(cast_seq, *args, **kwargs)
     86         else:
     87             # TODO: other mixed-type cases aren't due to amp.

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 33 and 34 in dimension 3 at /opt/conda/conda-bld/pytorch_1556653114079/work/aten/src/THC/generic/THCTensorMath.cu:71
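
The "Got 33 and 34" in the error can be reproduced with simple arithmetic. The sketch assumes five stride-2 encoder stages, each producing ceil(n / 2) (the encoder is not named in the issue, so this is an assumption), and a decoder that doubles the size at every step, as the scale_factor=2 line in the traceback shows. A width of 525 shrinks to 263, 132, 66, 33, 17, and doubling 17 gives 34, which cannot be concatenated with the 33-wide skip. The helper names are made up for illustration.

```python
import math

def feature_sizes(n, depth=5):
    """Spatial size after each of `depth` stride-2 stages (assumes ceil(n / 2) per stage)."""
    sizes = []
    for _ in range(depth):
        n = math.ceil(n / 2)
        sizes.append(n)
    return sizes

def first_mismatch(n, depth=5):
    """Return (upsampled, skip) at the first decoder step where sizes disagree, else None."""
    sizes = [n] + feature_sizes(n, depth)
    # Walk back up: each feature is doubled and compared with the level above it
    # (the input size itself at the top level).
    for deep, skip in zip(reversed(sizes[1:]), reversed(sizes[:-1])):
        if deep * 2 != skip:
            return deep * 2, skip
    return None

def pad_to_multiple(n, multiple=32):
    """Smallest size >= n that is divisible by `multiple`."""
    return math.ceil(n / multiple) * multiple
```

Under these assumptions a size works only if it is divisible by 32, which matches the report: 320x480 and 640x960 pass, while 350x525 and 160x240 (240 / 32 = 7.5) do not. Padding or resizing 350x525 to `pad_to_multiple(350)` by `pad_to_multiple(525)`, that is 352x544, would avoid the mismatch.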

error when using fscore

File "/home/lib/python3.7/site-packages/segmentation_models_pytorch/utils/functions.py", line 69, in f_score
    tp = torch.sum(gt * pr)
RuntimeError: expected backend CUDA and dtype Double but got backend CUDA and dtype Float

multiclass jaccard loss

Hi,

I am training a unet for a multi-class segmentation problem. There are 3 classes and I would like to update the weights based on classes 1 and 2 (leaving out the background class 0).

I call the loss function as:

MulticlassJaccardLoss(weight=[2,10], classes=[1,2], from_logits=False)

My MulticlassJaccardLoss class:

class MulticlassJaccardLoss(_Loss):
    """Implementation of Jaccard loss for multiclass (semantic) image segmentation task
    """
    __name__ = 'mc_jaccard_loss'
    def __init__(self, classes: List[int] = None, from_logits=True, weight=[2,6], reduction='elementwise_mean'):
        super(MulticlassJaccardLoss, self).__init__(reduction=reduction)
        self.classes = classes
        self.from_logits = from_logits
        self.weight = weight

    def forward(self, y_pred: Tensor, y_true: Tensor) -> Tensor:
        """
        :param y_pred: NxCxHxW
        :param y_true: NxHxW
        :return: scalar
        """
        if self.from_logits:
            y_pred = y_pred.softmax(dim=1)

        n_classes = y_pred.size(1)
        smooth = 1e-3
        
        if self.classes is None:
            classes = range(n_classes)
        else:
            classes = self.classes
            n_classes = len(classes)

        loss = torch.zeros(n_classes, dtype=torch.float, device=y_pred.device)
        print(loss.shape)
        

        if self.weight is None:
            weights = [1] * n_classes
        else:
            weights = self.weight

        for class_index, weight in zip(classes, weights):

            jaccard_target = (y_true == class_index).float()
            jaccard_output = y_pred[:, class_index, ...]

            num_preds = jaccard_target.long().sum()

            if num_preds == 0:
                loss[class_index-1] = 0 #custom
            else:
                iou = soft_jaccard_score(jaccard_output, jaccard_target, from_logits=False, smooth=smooth)
                loss[class_index-1] = (1.0 - iou) * weight #custom

        if self.reduction == 'elementwise_mean':
            return loss.mean()

        if self.reduction == 'sum':
            return loss.sum()

        return loss

When I train the model, it trains for a few iterations and then I get the following error:

element 0 of tensors does not require grad and does not have a grad_fn

How can I unfreeze the layers of the vgg16/vgg11 encoder used with the unet decoder?

How can I unfreeze the layers of vgg16? I saw your solution to this problem in the segmentation models keras repository, but not here for PyTorch; layer.trainable doesn't work here. Do you have an example, please? And how many layers of vgg16 can be unfrozen while training with the unet decoder for a segmentation task? Thanks a ton in advance.

invalid hash value

ENCODER = 'se_resnext50_32x4d'
ENCODER_WEIGHTS = 'imagenet'
DEVICE = 'cuda'
RuntimeError: invalid hash value (expected "a260b3a4", got "dc315dde03a64a11145b0aa4c61a29403a7b709376bcba910c851f0115d81a04")
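
The message compares a short expected prefix ("a260b3a4") with a full SHA-256 digest, which suggests the downloaded weights file is not the one that was expected, for instance because of an interrupted or replaced download. That reading is an assumption, as the issue gives no further detail. A cached file can be checked by hand with a sketch like the one below. The helper names are made up, and the file location is not stated in the issue.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_prefix(path, expected_prefix):
    """True if the file's SHA-256 digest starts with the expected prefix."""
    return sha256_of(path).startswith(expected_prefix)
```

If `matches_prefix(path_to_cached_weights, "a260b3a4")` returns False for the cached file, deleting it and letting the download run again is a reasonable first thing to try.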

How to use the weights after training?

It seems the model should not be initialized at test time the same way as during training. I tried commenting out the train_log line in the training for-loop, expecting testing to be done without training, but the visualization results show 0,1 inference. I guess the initialization of the model's ENCODER may overwrite the trained weights somehow. So how do I use the trained weights the right way?

Sincerely!

If I have only one class label, what should I do?

If I have only one class label, and the label is a black-and-white map, what should I do? This is the simplest question, but it confuses me in a complex framework.
Looking forward to your suggestions.

How to handle the multispectral image (NOT RGB image)?

I have a multispectral image dataset (each image has eight channels) and want to feed the images into the pretrained unet. In this case, how can I modify the network so that the initialization of the first layer from the pretrained weights is skipped? Otherwise there will be an error in the first layer's initialization due to the different input sizes of the two models. Thanks.

TypeError: __init__() got an unexpected keyword argument 'groups'

I installed the new version of the library from source:
pip install git+https://github.com/qubvel/segmentation_models.pytorch
And now I have this problem:
TypeError: __init__() got an unexpected keyword argument 'groups'

Full error:

  File "segmentation_model.py", line 235, in <module>
    defect_crop=args.defect_crop)
  File "segmentation_model.py", line 166, in train
    model = get_model(model_name=model_name, encoder_name=encoder).to(device)
  File "segmentation_model.py", line 98, in get_model
    model = FPN(encoder_name=encoder_name, classes=4, activation='sigmoid', encoder_weights=encoder_weights)
  File "/home/andrii/.conda/envs/dev/lib/python3.7/site-packages/segmentation_models_pytorch/fpn/model.py", line 39, in __init__
    encoder_weights=encoder_weights
  File "/home/andrii/.conda/envs/dev/lib/python3.7/site-packages/segmentation_models_pytorch/encoders/__init__.py", line 24, in get_encoder
    encoder = Encoder(**encoders[name]['params'])
  File "/home/andrii/.conda/envs/dev/lib/python3.7/site-packages/segmentation_models_pytorch/encoders/resnet.py", line 10, in __init__
    super().__init__(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'groups'

How to train PSPNet with single-channel input?

First of all, amazing work on both of your segmentation libraries. It saved me a lot of time.
I want to train PSPNet using single-channel grayscale images but am not able to figure out how to do it. Your keras documentation already covers this. It would be really helpful if you could suggest how to do the same here.
Thanks

image shape

Hello, my image has 4 channels. What should I do to use this model?

No module named 'segmentation_models_pytorch.common.blocks'

Hi,

I'm working on an internet-restricted system. I've installed segmentation_models.pytorch from source code using pip install .
Now when I try to import it, I get the following error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-b9e13fa886e0> in <module>
----> 1 import segmentation_models_pytorch as smp

/opt/conda/lib/python3.6/site-packages/segmentation_models_pytorch/__init__.py in <module>
----> 1 from .unet import Unet
      2 from .linknet import Linknet
      3 from .fpn import FPN
      4 from .pspnet import PSPNet
      5 

/opt/conda/lib/python3.6/site-packages/segmentation_models_pytorch/unet/__init__.py in <module>
----> 1 from .model import Unet

/opt/conda/lib/python3.6/site-packages/segmentation_models_pytorch/unet/model.py in <module>
----> 1 from .decoder import UnetDecoder
      2 from ..base import EncoderDecoder
      3 from ..encoders import get_encoder
      4 
      5 

/opt/conda/lib/python3.6/site-packages/segmentation_models_pytorch/unet/decoder.py in <module>
      3 import torch.nn.functional as F
      4 
----> 5 from ..common.blocks import Conv2dReLU
      6 from ..base.model import Model
      7 

ModuleNotFoundError: No module named 'segmentation_models_pytorch.common.blocks'


Any ideas how this error can be solved?

How to

Sorry, I clicked by mistake; my issue will be stated elsewhere.

Question about activation in example

In your example you use a Sigmoid activation at the end of the Unet. But at the same time all your losses and metrics are computed using one more sigmoid activation. Is it appropriate to apply the sigmoid activation twice?
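
Whether the example really applies the activation twice is not verified here, but the effect of doing so is easy to show. The first sigmoid maps logits into (0, 1), and a second sigmoid on values in (0, 1) can only produce outputs between 0.5 and roughly 0.731. Every score would then sit above a 0.5 threshold, and the loss would see a much flatter signal. The logits below are made up.

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

logits = [-10.0, -2.0, 0.0, 2.0, 10.0]

once = [sigmoid(z) for z in logits]    # spans almost all of (0, 1)
twice = [sigmoid(p) for p in once]     # squeezed into (0.5, sigmoid(1))
```

Comparing `once` and `twice` for a strongly negative logit shows the problem: a confident "background" score near 0 becomes a score just above 0.5.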

[Request] EfficientNet as an encoder

First, I would like to thank you for making a great project.
I found that you added EfficientNet to your Keras project and I was wondering if you could add it to this PyTorch project too?

Use concatenation for feature pyramid aggregation?

Hi! Thanks to the repo owner for the contribution! This repository is useful and benefits lots of people!

I would like to discuss the implementation of FPN in this repo with the people watching it.
According to this document, I think page 25 suggests that we should use concatenation instead of summation, if I did not misunderstand the page.


    def forward(self, x):
        c5, c4, c3, c2, _ = x

        p5 = self.conv1(c5)
        p4 = self.p4([p5, c4])
        p3 = self.p3([p4, c3])
        p2 = self.p2([p3, c2])

        s5 = self.s5(p5)
        s4 = self.s4(p4)
        s3 = self.s3(p3)
        s2 = self.s2(p2)
       
        # use concatenation instead of summation?
        # x = s5 + s4 + s3 + s2
        x = torch.cat([s5, s4, s3, s2], dim=1)

        x = self.dropout(x)
        x = self.final_conv(x)

        x = F.interpolate(x, scale_factor=4, mode='bilinear', align_corners=True)
        return x

example doesn't work

I wasn't able to get the example with the cars segmentation to run.
Changing the code in unet/decoder.py for the class DecoderBlock worked for me:

def forward(self, x):
    x, skip = x
    if skip is not None:
        x = F.interpolate(x, size=(skip.shape[-2], skip.shape[-1]), mode='nearest')
        x = torch.cat([x, skip], dim=1)
    else:
        x = F.interpolate(x, scale_factor=2, mode='nearest')
    x = self.block(x)
    return x

confused by loss function forward params

Hi qubvel, I am confused by the loss function forward params.

As shown in utils/functions.py, the functions iou and f_score calculate the IOU loss and DICE loss with pr and gt as params. Does this mean pr (torch.Tensor) is a tensor with shape [batch, channel, width, height]? If so, why is the annotation "A list of predicted elements"? I'm confused about what pr is: a tensor with shape [batch, channel, width, height], or a list with batch * channel * width * height elements?
