qubvel / segmentation_models.pytorch
Segmentation models with pretrained backbones. PyTorch.
License: MIT License
import segmentation_models_pytorch as smp
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-b9e13fa886e0> in <module>
----> 1 import segmentation_models_pytorch as smp
~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/__init__.py in <module>
----> 1 from .unet import Unet
2 from .linknet import Linknet
3 from .fpn import FPN
4 from .pspnet import PSPNet
5
~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/unet/__init__.py in <module>
----> 1 from .model import Unet
~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/unet/model.py in <module>
1 from .decoder import UnetDecoder
2 from ..base import EncoderDecoder
----> 3 from ..encoders import get_encoder
4
5
~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/encoders/__init__.py in <module>
3 from .resnet import resnet_encoders
4 from .dpn import dpn_encoders
----> 5 from .vgg import vgg_encoders
6 from .senet import senet_encoders
7 from .densenet import densenet_encoders
~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/encoders/vgg.py in <module>
2 from torchvision.models.vgg import VGG
3 from torchvision.models.vgg import make_layers
----> 4 from torchvision.models.vgg import cfg
5 from pretrainedmodels.models.torchvision_models import pretrained_settings
6
ImportError: cannot import name 'cfg'
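A hedged workaround, assuming the cause is that torchvision 0.3+ renamed the vgg config dict from cfg to cfgs: alias the old name back before importing smp. (Pinning torchvision to 0.2.x, or upgrading segmentation_models_pytorch, should also work.)

import torchvision.models.vgg as vgg_module

# alias the renamed config dict back to the old name smp expects
if not hasattr(vgg_module, 'cfg') and hasattr(vgg_module, 'cfgs'):
    vgg_module.cfg = vgg_module.cfgs

import segmentation_models_pytorch as smp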
If I have four classes, given this DiceLoss implementation:

class DiceLoss(nn.Module):
    __name__ = 'dice_loss'

    def __init__(self, eps=1e-7, activation='softmax2d'):
        super().__init__()
        self.activation = activation
        self.eps = eps

    def forward(self, y_pr, y_gt):
        return 1 - F.f_score(y_pr, y_gt, beta=1., eps=self.eps,
                             threshold=None, activation=self.activation)

should I set activation='softmax2d' myself?
I am using Unet with a resnet50 encoder, working on a medical dataset with 7 classes. I noticed that my IoU score and f_score stay pretty low during the whole training stage. Maybe it's because there is class imbalance in my dataset, and sometimes the background (which does not belong to any class) dominates an input image. Thanks for your work; I have 2 questions. First, should I consider pixels with no labels as a class? Because in pixel-wise CE loss the target array should be 1 channel, and all background pixels should be labeled with 0 (a class number). Second, what loss function should I use, considering my dataset is unique and the IoU and f_score stay low? And how does the selection of activation functions influence the calculation of IoU and f_score?
Thanks
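Not the library's answer, just the standard PyTorch knobs relevant to both questions: per-class weights counteract imbalance, and ignore_index lets unlabeled background pixels be excluded from the loss instead of being forced into class 0. The weight values below are hypothetical.

import torch
import torch.nn as nn

# 7 classes + background = 8 entries; down-weight the dominant background class
weights = torch.tensor([0.1, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
loss = nn.CrossEntropyLoss(weight=weights)
# or, if background pixels are marked with a sentinel value such as 255:
# loss = nn.CrossEntropyLoss(ignore_index=255)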
It looks like it is sigmoid or softmax applied to the output layer of the UNet architecture.
It would be nice to have an option not to apply anything.
The workaround is:
def activation(x): return x
model = smp.Unet('resnet34', encoder_weights='imagenet', activation=activation)
But it would be nice to do it without this hack.
Hi, there is an error.
~/anaconda3/envs/dl/lib/python3.7/site-packages/segmentation_models_pytorch/encoders/vgg.py in <module>
2 from torchvision.models.vgg import VGG
3 from torchvision.models.vgg import make_layers
----> 4 from torchvision.models.vgg import cfg
ImportError: cannot import name 'cfg' from 'torchvision.models.vgg' (/Users/anaconda3/envs/dl/lib/python3.7/site-packages/torchvision/models/vgg.py)
My PyTorch version is 1.2.0 and my torchvision version is 0.4.0.
Hello, first of all thanks for an awesome library.
I am doing semantic segmentation for one class, in the https://www.kaggle.com/c/siim-acr-pneumothorax-segmentation competition.
When I try to use this loss I get an error. I even tried smp.utils.losses.nn.BCEWithLogitsLoss(), and got the error: 'BCEWithLogitsLoss' object has no attribute '__name__'.
Thanks.
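A possible workaround (not an official API): smp's epoch runners read loss.__name__ for logging, so a thin subclass that defines the attribute makes the stock PyTorch loss usable.

import torch.nn as nn

class NamedBCEWithLogitsLoss(nn.BCEWithLogitsLoss):
    __name__ = 'bce_with_logits_loss'  # attribute expected by smp's epoch runners

loss = NamedBCEWithLogitsLoss()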
What's wrong with resnext50_32x4d?
model = smp.Unet("resnext50_32x4d", encoder_weights="imagenet", classes=4, activation='sigmoid')
error:
KeyError: 'resnext50_32x4d'
However, it's defined here:
https://github.com/qubvel/segmentation_models.pytorch/blob/master/segmentation_models_pytorch/encoders/resnet.py#L85
Thanks for your effort.
Could you please show how to build it from source?
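For reference, installing the master branch from source is just pip install git+https://github.com/qubvel/segmentation_models.pytorch (the same command appears later in this thread); presumably the PyPI release simply predates the resnext50_32x4d encoder.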
I have slightly modified the script from the cars segmentation ipynb file for my own binary segmentation mask:
If I use the same training loop for resnet18, I get a very high IoU, even higher than 1, which is impossible.
Using evaluation gives:
test_epoch = ValidEpoch(
    model=best_model,
    loss=loss,
    metrics=metrics,
    device=DEVICE,
)
40/40 [00:06<00:00, 7.23it/s, bce_dice_loss - -3.208e+03, iou - 1.163, f-score - 0.9803]
But when I score manually on validation, the result is much, much lower:
scores = []
for i in range(40):
    n = i  # np.random.choice(len(valid_dataset))
    image_vis = imread(valid_dataset.images_fps[n])
    image, gt_mask = valid_dataset[n]
    image, gt_mask = image, gt_mask.transpose(1, 2, 0)
    gt_mask = gt_mask.squeeze()
    x_tensor = torch.from_numpy(image).to(DEVICE).unsqueeze(0)
    pr_mask = best_model.predict(x_tensor)
    pr_mask = (pr_mask.squeeze().cpu().numpy() > 0.5).astype(np.uint8)
    score = iou(torch.from_numpy(gt_mask).float().to(DEVICE),
                torch.from_numpy(pr_mask).float().to(DEVICE)).data.cpu().numpy().max()
    # print(score)
    scores.append(score)
IoU: 0.07
Then I tried to change the preprocessing to rescaling by dividing by 255 - the model just doesn't learn - IoU near 0.0007, with almost the same result when I measure it manually.
Using Keras I achieved an IoU of almost 0.5 with almost the same pipeline.
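A hedged guess, not a confirmed diagnosis: an IoU above 1 together with a large negative bce_dice_loss often means the ground-truth masks are encoded as 0/255 rather than 0/1, which inflates the intersection and union sums. Forcing binary masks in the Dataset would be the first thing to check:

gt_mask = (gt_mask > 0).astype('float32')  # force 0/1 masks before computing losses/metrics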
Hi friend, I'm new to semantic segmentation, so I have to understand and test every step in your example code cars segmentation (camvid).ipynb.
Can you tell me the dataset's name, so I can download it and run this example conveniently?
Thanks for your help.
Now we have a way to get a preprocess_input function:
In [2]: preprocess_input = get_preprocessing_fn('resnet18', pretrained='imagenet')
In [3]: preprocess_input
Out[3]: functools.partial(<function preprocess_input at 0x7efc1b644400>, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], input_space='RGB', input_range=[0, 1])
I would like to be able, for a given encoder, to get a dictionary with the parameters: mean, std, input_space, input_range.
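A note on what already exists: since get_preprocessing_fn returns a functools.partial, its bound keyword arguments already form exactly this dictionary.

params = preprocess_input.keywords
# {'mean': [0.485, 0.456, 0.406], 'std': [0.229, 0.224, 0.225],
#  'input_space': 'RGB', 'input_range': [0, 1]}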
Hi,
I need to modify the models by adding skip connections between encoder layers and decoder layers, like this:
x = input
x = self.layer0[-1](x)
x = x + input
x1 = self.layer1(x)
x1 = x1 + x
x2 = self.layer2(x1)
x2 = x2 + x1
x3 = self.layer3(x2)
x3 = x3 + x2
x4 = self.layer4(x3)
x4 = x4 + x3
I've tried implementing this and ran into the problem that the dimensions before and after passing through a layer are unequal, so the values can't be added.
Is it possible to implement this, and how?
Thank you.
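A minimal sketch of one way to make the additions well-typed, assuming the usual ResNet layout where each layer changes the channel count and may halve the spatial size: project the shortcut with a strided 1x1 convolution, the same trick ResNet uses for its own projection shortcuts. The module name and channel numbers here are hypothetical.

import torch.nn as nn

class ProjectedSkip(nn.Module):
    """Adds a shortcut to layer(x) even when shapes differ, via a 1x1 projection."""
    def __init__(self, layer, in_ch, out_ch, stride=2):
        super().__init__()
        self.layer = layer
        self.proj = nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride, bias=False)

    def forward(self, x):
        return self.layer(x) + self.proj(x)

# e.g. for a ResNet50 encoder: x2 = ProjectedSkip(self.layer2, in_ch=256, out_ch=512)(x1)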
You are using dice + bce, but your dice is calculated as 1 - F.f_score. Should it be 1 - F.dice_coef?
Hello, thanks for sharing the code; it's rather convenient for dealing with segmentation tasks.
When I use it for training my model, I'm confused about the train and valid operations.
As the following shows:
train_logs = train_epoch.run(train_loader)
valid_logs = valid_epoch.run(valid_loader)
The train_epoch and valid_epoch are created separately from smp.utils.train.TrainEpoch and smp.utils.train.ValidEpoch; however, the valid_epoch instance can use the weights obtained by the train_epoch instance for validation. Maybe I've missed some key point.
So how do they share the model weights with each other?
Warm regards.
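A sketch of the likely answer, based on how the runners are built in the CamVid tutorial: both epoch objects are constructed with the same model instance, so the weights updated by train_epoch are the very same tensors read by valid_epoch; nothing needs to be copied.

train_epoch = smp.utils.train.TrainEpoch(model, loss=loss, metrics=metrics,
                                         optimizer=optimizer, device=DEVICE)
valid_epoch = smp.utils.train.ValidEpoch(model, loss=loss, metrics=metrics,
                                         device=DEVICE)
assert train_epoch.model is valid_epoch.model  # one shared nn.Module, no explicit syncing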
I used the se_resnext50_32x4d encoder and Unet decoder to train a segmentation model, following the CamVid tutorial on the webpage.
However, the pr_mask is not correct, although the accuracy on the CamVid test set is very high (iou - 0.7417):
valid: 100%|████████████████████████████████████████████| 233/233 [00:10<00:00, 21.86it/s, bce_dice_loss - 0.3587, iou - 0.7417, f-score - 0.8195]
The trained model and codes can be found here
https://drive.google.com/drive/folders/0B6X3_r_lRbVUODRiM2RhMWMtNDk4NC00NmM1LWEyODEtMTdjNDI5MWJiMmQ4
When I update the torchvision lib to version 0.3, I meet this problem:
ImportError: cannot import name 'cfg' from 'torchvision.models.vgg'
Also, torchvision 0.3 adds some pretrained segmentation models; can they be added to the smp APIs?
In remote sensing, images usually have more than three channels; for example, NIR, R, G and B. I want to leverage the pretrained weights of a standard ResNet50 and transfer them to a 4-channel input version by copying the RGB weights and initializing the NIR weight to be the same as the red channel's. I have solved this in my own code, but I have no idea how to do it in yours. How can this be done in your code? Thank you!
Here is my solution in my code:

import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet50(pretrained=True)
weight = model.conv1.weight.clone()
model.conv1 = nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3, bias=False)
with torch.no_grad():
    model.conv1.weight[:, :3] = weight
    model.conv1.weight[:, 3] = model.conv1.weight[:, 0]
x = torch.randn(10, 4, 224, 224)
output = model(x)
I am currently using a UNET from this package and have some weird errors with respect to image size that I don't fully understand.
If I use an image size of 320x480 everything works fine, but when I switch to e.g. 350x525 I get the following error. Certain image sizes seem to work, others don't: 640x960 works, but 160x240 does not. Any ideas?
~/anaconda3/envs/xxx/lib/python3.7/site-packages/segmentation_models_pytorch/base/encoder_decoder.py in forward(self, x)
23 """Sequentially pass `x` trough model`s `encoder` and `decoder` (return logits!)"""
24 x = self.encoder(x)
---> 25 x = self.decoder(x)
26 return x
27
~/anaconda3/envs/xxx/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
491 result = self._slow_forward(*input, **kwargs)
492 else:
--> 493 result = self.forward(*input, **kwargs)
494 for hook in self._forward_hooks.values():
495 hook_result = hook(self, input, result)
~/anaconda3/envs/xxx/lib/python3.7/site-packages/segmentation_models_pytorch/unet/decoder.py in forward(self, x)
93 encoder_head = self.center(encoder_head)
94
---> 95 x = self.layer1([encoder_head, skips[0]])
96 x = self.layer2([x, skips[1]])
97 x = self.layer3([x, skips[2]])
~/anaconda3/envs/xxx/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
491 result = self._slow_forward(*input, **kwargs)
492 else:
--> 493 result = self.forward(*input, **kwargs)
494 for hook in self._forward_hooks.values():
495 hook_result = hook(self, input, result)
~/anaconda3/envs/xxx/lib/python3.7/site-packages/segmentation_models_pytorch/unet/decoder.py in forward(self, x)
26 x = F.interpolate(x, scale_factor=2, mode='nearest')
27 if skip is not None:
---> 28 x = torch.cat([x, skip], dim=1)
29 x = self.attention1(x)
30
~/anaconda3/envs/xxx/lib/python3.7/site-packages/apex/amp/wrap.py in wrapper(seq, *args, **kwargs)
83 cast_seq = utils.casted_args(maybe_float,
84 seq, {})
---> 85 return orig_fn(cast_seq, *args, **kwargs)
86 else:
87 # TODO: other mixed-type cases aren't due to amp.
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 33 and 34 in dimension 3 at /opt/conda/conda-bld/pytorch_1556653114079/work/aten/src/THC/generic/THCTensorMath.cu:71
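A plausible explanation, consistent with the sizes reported above though not an official statement: a Unet with five downsampling stages needs both spatial dimensions divisible by 32. 320x480 and 640x960 qualify; 350x525 and 160x240 do not (240/32 = 7.5). One common fix is to pad inputs up to the next multiple of 32:

import torch.nn.functional as F

def pad_to_multiple(x, multiple=32):
    """Zero-pad an NCHW tensor so H and W are divisible by `multiple`."""
    h, w = x.shape[-2:]
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    return F.pad(x, (0, pad_w, 0, pad_h))  # (left, right, top, bottom)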
File "/home/lib/python3.7/site-packages/segmentation_models_pytorch/utils/functions.py", line 69, in f_score
tp = torch.sum(gt * pr) RuntimeError: expected backend CUDA and dtype Double but got backend CUDA and dtype Float
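A hedged guess at the fix: the ground truth tensor is float64 (NumPy's default) while the prediction is float32, so casting before the metric resolves the mismatch:

gt = gt.float()  # float64 -> float32, matching the model output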
Hi,
I am training a Unet for a multi-class segmentation problem. There are 3 classes, and I would like to update weights based on classes 1 and 2 (leaving out the background class 0).
I call the loss function as:
MulticlassJaccardLoss(weight=[2,10], classes=[1,2], from_logits=False)
My MulticlassJaccardLoss class:
class MulticlassJaccardLoss(_Loss):
    """Implementation of Jaccard loss for multiclass (semantic) image segmentation task"""
    __name__ = 'mc_jaccard_loss'

    def __init__(self, classes: List[int] = None, from_logits=True, weight=[2, 6], reduction='elementwise_mean'):
        super(MulticlassJaccardLoss, self).__init__(reduction=reduction)
        self.classes = classes
        self.from_logits = from_logits
        self.weight = weight

    def forward(self, y_pred: Tensor, y_true: Tensor) -> Tensor:
        """
        :param y_pred: NxCxHxW
        :param y_true: NxHxW
        :return: scalar
        """
        if self.from_logits:
            y_pred = y_pred.softmax(dim=1)

        n_classes = y_pred.size(1)
        smooth = 1e-3

        if self.classes is None:
            classes = range(n_classes)
        else:
            classes = self.classes
            n_classes = len(classes)

        loss = torch.zeros(n_classes, dtype=torch.float, device=y_pred.device)
        print(loss.shape)

        if self.weight is None:
            weights = [1] * n_classes
        else:
            weights = self.weight

        for class_index, weight in zip(classes, weights):
            jaccard_target = (y_true == class_index).float()
            jaccard_output = y_pred[:, class_index, ...]
            num_preds = jaccard_target.long().sum()

            if num_preds == 0:
                loss[class_index - 1] = 0  # custom
            else:
                iou = soft_jaccard_score(jaccard_output, jaccard_target, from_logits=False, smooth=smooth)
                loss[class_index - 1] = (1.0 - iou) * weight  # custom

        if self.reduction == 'elementwise_mean':
            return loss.mean()
        if self.reduction == 'sum':
            return loss.sum()
        return loss
When I train the model, it trains for a few iterations and then I get the following error:
element 0 of tensors does not require grad and does not have a grad_fn
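A hedged guess at the cause, from reading the class above: on a batch where every selected class has num_preds == 0, `loss` remains the freshly created zeros tensor, which has no grad_fn, so backward() fails. Keeping the empty-class branch attached to the graph avoids this:

if num_preds == 0:
    loss[class_index - 1] = 0.0 * jaccard_output.sum()  # still zero loss, but with a grad_fn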
When I try to import the module, I get an error:
cannot import name 'cfg'
Help wanted!
I think 'renset18' should be modified to 'resnet18'.
How can I unfreeze the layers of vgg16? I see your solution to this problem in the segmentation_models Keras repository, but not here for PyTorch; layer.trainable doesn't work here. Any example, please? And how many layers of vgg16 can be unfrozen while training with a Unet decoder for a segmentation task? Thanks a ton in advance.
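A minimal sketch of the PyTorch equivalent of Keras's layer.trainable, assuming a model built as smp.Unet('vgg16', ...) whose VGG encoder exposes the usual torchvision `features` Sequential (the slice index below is hypothetical):

model = smp.Unet('vgg16', encoder_weights='imagenet')

# freeze the whole encoder
for param in model.encoder.parameters():
    param.requires_grad = False

# unfreeze roughly the last conv block
for param in list(model.encoder.features.parameters())[-6:]:
    param.requires_grad = True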
ENCODER = 'se_resnext50_32x4d'
ENCODER_WEIGHTS = 'imagenet'
DEVICE = 'cuda'
RuntimeError: invalid hash value (expected "a260b3a4", got "dc315dde03a64a11145b0aa4c61a29403a7b709376bcba910c851f0115d81a04")
It seems like the model should not be initialized for testing the same way as during training. I tried commenting out the train_logs line in the training for-loop, expecting testing to run without training, but the visualization results show all-0/1 inference. I guess the initialization of the model with ENCODER may be covering the trained weights somehow. So how do I use the trained weights in the right way?
Sincerely!
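A standard-PyTorch sketch (not a library-specific API) of what presumably goes wrong here: the trained weights need to be saved after training and loaded back before inference, so that re-creating the model for testing never overwrites them. The file name and constructor arguments below are hypothetical.

# after training
torch.save(model.state_dict(), './best_model.pth')

# at test time: build the same architecture, then restore the trained weights
model = smp.Unet(ENCODER, encoder_weights=ENCODER_WEIGHTS, classes=1, activation='sigmoid')
model.load_state_dict(torch.load('./best_model.pth'))
model.eval()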
File "/home/Disk0/SMP/segmentation_models_pytorch/encoders/resnet.py", line 4, in
from pretrainedmodels.models.torchvision_models import pretrained_settings
Where does the pretrainedmodels package come from?
Thanks.
What if I have only one class label, where the label is a black-and-white map? This is the simplest question, but it confuses me in a complex framework.
Looking forward to your suggestions.
I have a multispectral image dataset (each image has eight channels) and want to feed it into the pretrained Unet. In this case, how can I modify the network so that the initialization of the first layer from the pretrained weights is skipped? Otherwise, there will be an error initializing the first layer, due to the different input sizes of the two models. Thanks.
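A hedged sketch of one way to do this with the current API, assuming a ResNet encoder whose stem conv is exposed as encoder.conv1: load everything pretrained, then swap in a freshly initialized 8-channel first conv, so only that one layer loses its pretrained weights.

import torch.nn as nn
import segmentation_models_pytorch as smp

model = smp.Unet('resnet34', encoder_weights='imagenet')
old = model.encoder.conv1
# replace the 3-channel stem with an 8-channel one; its weights start from scratch
model.encoder.conv1 = nn.Conv2d(8, old.out_channels, kernel_size=old.kernel_size,
                                stride=old.stride, padding=old.padding, bias=False)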
Is the input size 224*224?
I downloaded the new version of the library from source:
pip install git+https://github.com/qubvel/segmentation_models.pytorch
And now I have this problem:
TypeError: __init__() got an unexpected keyword argument 'groups'
Full error:
File "segmentation_model.py", line 235, in <module>
defect_crop=args.defect_crop)
File "segmentation_model.py", line 166, in train
model = get_model(model_name=model_name, encoder_name=encoder).to(device)
File "segmentation_model.py", line 98, in get_model
model = FPN(encoder_name=encoder_name, classes=4, activation='sigmoid', encoder_weights=encoder_weights)
File "/home/andrii/.conda/envs/dev/lib/python3.7/site-packages/segmentation_models_pytorch/fpn/model.py", line 39, in __init__
encoder_weights=encoder_weights
File "/home/andrii/.conda/envs/dev/lib/python3.7/site-packages/segmentation_models_pytorch/encoders/__init__.py", line 24, in get_encoder
encoder = Encoder(**encoders[name]['params'])
File "/home/andrii/.conda/envs/dev/lib/python3.7/site-packages/segmentation_models_pytorch/encoders/resnet.py", line 10, in __init__
super().__init__(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'groups'
First of all, amazing work on both of your segmentation libraries. They saved me a lot of time.
I want to train PSPNet using single-channel grayscale images, but I am not able to figure out how to do it. In your Keras documentation you have already covered this. It would be really helpful if you could suggest the same here.
Thanks
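A simple workaround, mirroring what the Keras docs suggest: repeat the single grey channel three times so the pretrained RGB stem can be reused unchanged. The array below is a synthetic stand-in for a real image.

import numpy as np

gray = np.random.rand(384, 480).astype('float32')     # hypothetical H x W grayscale image
image = np.repeat(gray[..., np.newaxis], 3, axis=-1)  # H x W x 3, ready for an RGB model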
Hello, my image has 4 channels. What should I do to use this model?
Hi,
I'm working on an internet-restricted system. I've installed segmentation_models.pytorch from source using pip install .
Now when I try to import it, I get the following error:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-3-b9e13fa886e0> in <module>
----> 1 import segmentation_models_pytorch as smp
/opt/conda/lib/python3.6/site-packages/segmentation_models_pytorch/__init__.py in <module>
----> 1 from .unet import Unet
2 from .linknet import Linknet
3 from .fpn import FPN
4 from .pspnet import PSPNet
5
/opt/conda/lib/python3.6/site-packages/segmentation_models_pytorch/unet/__init__.py in <module>
----> 1 from .model import Unet
/opt/conda/lib/python3.6/site-packages/segmentation_models_pytorch/unet/model.py in <module>
----> 1 from .decoder import UnetDecoder
2 from ..base import EncoderDecoder
3 from ..encoders import get_encoder
4
5
/opt/conda/lib/python3.6/site-packages/segmentation_models_pytorch/unet/decoder.py in <module>
3 import torch.nn.functional as F
4
----> 5 from ..common.blocks import Conv2dReLU
6 from ..base.model import Model
7
ModuleNotFoundError: No module named 'segmentation_models_pytorch.common.blocks'
Any ideas how this error can be solved?
Why do we have a constraint on the torchvision version?
https://github.com/qubvel/segmentation_models.pytorch/blob/master/requirements.txt#L1
Sorry, wrongly clicked; my issue will be filed elsewhere.
In your example you use sigmoid activation at the end of the Unet. But at the same time, all your losses and metrics apply one more sigmoid activation. Is it appropriate to apply sigmoid activation twice?
First, I would like to thank you for making a great project.
I found that you added EfficientNet to your Keras project, and I was wondering if you could add it to this PyTorch project too?
Hi! Thanks for the repo owner's contribution! This repository is useful and benefits lots of people!
I would like to discuss the implementation of FPN in this repo with the people watching it.
According to this document, I think page 25 suggests that we should use concatenation instead of summation, if I did not misunderstand the page.
def forward(self, x):
    c5, c4, c3, c2, _ = x

    p5 = self.conv1(c5)
    p4 = self.p4([p5, c4])
    p3 = self.p3([p4, c3])
    p2 = self.p2([p3, c2])

    s5 = self.s5(p5)
    s4 = self.s4(p4)
    s3 = self.s3(p3)
    s2 = self.s2(p2)

    # use concatenation instead of summation?
    # x = s5 + s4 + s3 + s2
    x = torch.cat([s5, s4, s3, s2], dim=1)

    x = self.dropout(x)
    x = self.final_conv(x)
    x = F.interpolate(x, scale_factor=4, mode='bilinear', align_corners=True)
    return x
It would be nice to make the upsampling optional.
It would be great if there would be an option to use mobilenet as encoder
I wasn't able to get the cars segmentation example to run.
Changing the code in unet/decoder.py for the class DecoderBlock worked for me:
def forward(self, x):
    x, skip = x
    if skip is not None:
        x = F.interpolate(x, size=(skip.shape[-2], skip.shape[-1]), mode='nearest')
        x = torch.cat([x, skip], dim=1)
    else:
        x = F.interpolate(x, scale_factor=2, mode='nearest')
    x = self.block(x)
    return x
Thanks for your great efforts!
Could you offer a tutorial about training on one's own data using this framework?
Best wishes!
https://pytorch.org/docs/stable/nn.html#torch.nn.NLLLoss expects log probabilities.
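An illustration of the point above: NLLLoss must be fed log-probabilities, so raw logits need log_softmax first (or use CrossEntropyLoss, which fuses the two). The tensor shapes below are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.randn(2, 4, 8, 8)          # N x C x H x W model output (raw logits)
target = torch.randint(0, 4, (2, 8, 8))   # N x H x W integer class labels
loss = nn.NLLLoss()(F.log_softmax(logits, dim=1), target)
# equivalent to: nn.CrossEntropyLoss()(logits, target)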
When I run "cars segmentation (camvid).ipynb", there is an error: AttributeError: Can't pickle local object 'get_preprocessing_fn.<locals>._preprocess_input'
The error occurs in the "train model for 40 epochs" cell.
Thanks
Hi qubvel, I am confused by the loss functions' forward params.
As shown in utils/functions.py, iou and f_score calculate the IoU loss and Dice loss with pr and gt as params. Does this mean pr (torch.Tensor) is a tensor with shape [batch, channel, width, height]? But then why is the annotation "A list of predicted elements"? I'm confused about what pr is: a tensor with shape [batch, channel, width, height], or a list with batch * channel * width * height elements. Which one?