
pytorch-msssim's Introduction

PyTorch MS-SSIM


Fast and differentiable MS-SSIM and SSIM for pytorch.

Structural Similarity (SSIM):

Multi-Scale Structural Similarity (MS-SSIM):

Why is it faster than other versions?

Gaussian kernels used in SSIM & MS-SSIM are separable. A separable filter in image processing can be written as the product of two simpler filters. Typically, a 2-dimensional convolution is separated into two 1-dimensional convolutions. This reduces the computational cost on an $N\times M$ image with an $m\times n$ filter from $\mathcal{O}(M\cdot N \cdot m \cdot n)$ down to $\mathcal{O}(M\cdot N \cdot (m+n))$. More importantly, separated kernels are more contiguous and thus more cache-friendly than 2-D kernels, which effectively accelerates the computation of SSIM/MS-SSIM.
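The separability claim above can be checked with a small sketch. It uses plain Python lists rather than PyTorch so that it runs without dependencies; the helper names (`gaussian_1d`, `conv2d_valid`) and the 5-tap window are illustrative choices, not part of pytorch-msssim.

```python
import math

def gaussian_1d(size=5, sigma=1.5):
    """Normalized 1-D Gaussian window."""
    center = size // 2
    g = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(size)]
    total = sum(g)
    return [v / total for v in g]

def conv2d_valid(img, kernel):
    """Direct 2-D 'valid' correlation: kh * kw multiplies per output pixel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

g = gaussian_1d()
kernel_2d = [[a * b for b in g] for a in g]   # 2-D Gaussian = outer product of 1-D windows
row_kernel = [g]                              # 1 x n kernel (blur along the width)
col_kernel = [[v] for v in g]                 # m x 1 kernel (blur along the height)

# Small deterministic test image.
img = [[float((3 * r + 5 * c) % 7) for c in range(8)] for r in range(8)]

direct = conv2d_valid(img, kernel_2d)
separable = conv2d_valid(conv2d_valid(img, row_kernel), col_kernel)

max_diff = max(abs(a - b)
               for row_a, row_b in zip(direct, separable)
               for a, b in zip(row_a, row_b))
print(max_diff)  # differs only by floating-point rounding
```

The two-pass version touches m + n kernel taps per output pixel instead of m * n, which is where the complexity reduction quoted above comes from.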

Update

2023.05.26 (v1.0.0)

Just a release. Type hints by @iyume

2020.08.21 (v0.2.1)

3D image support from @FynnBe!

2020.04.30 (v0.2)

As of v0.2, ssim & ms-ssim produce results consistent with tensorflow and skimage. A benchmark (pytorch-msssim, tensorflow and skimage) can be found in the Tests section.

Installation

pip install pytorch-msssim

Usage

1. Basic Usage

from pytorch_msssim import ssim, ms_ssim, SSIM, MS_SSIM
# X: (N,3,H,W) a batch of non-negative RGB images (0~255)
# Y: (N,3,H,W)  

# calculate ssim & ms-ssim for each image
ssim_val = ssim( X, Y, data_range=255, size_average=False) # return (N,)
ms_ssim_val = ms_ssim( X, Y, data_range=255, size_average=False ) #(N,)

# set 'size_average=True' to get a scalar value as loss. see tests/tests_loss.py for more details
ssim_loss = 1 - ssim( X, Y, data_range=255, size_average=True) # return a scalar
ms_ssim_loss = 1 - ms_ssim( X, Y, data_range=255, size_average=True )

# reuse the gaussian kernel with SSIM & MS_SSIM. 
ssim_module = SSIM(data_range=255, size_average=True, channel=3) # channel=1 for grayscale images
ms_ssim_module = MS_SSIM(data_range=255, size_average=True, channel=3)

ssim_loss = 1 - ssim_module(X, Y)
ms_ssim_loss = 1 - ms_ssim_module(X, Y)

2. Normalized input

If you need to calculate MS-SSIM/SSIM on normalized images, please denormalize them to the range of [0, 1] or [0, 255] first.

# X: (N,3,H,W) a batch of normalized images (-1 ~ 1)
# Y: (N,3,H,W)  
X = (X + 1) / 2  # [-1, 1] => [0, 1]
Y = (Y + 1) / 2  
ms_ssim_val = ms_ssim( X, Y, data_range=1, size_average=False ) #(N,)

3. Enable nonnegative_ssim

For ssim, it is recommended to set nonnegative_ssim=True to avoid negative results. However, this option is set to False by default to keep it consistent with tensorflow and skimage.

For ms-ssim, there is no nonnegative_ssim option; the SSIM responses are forced to be non-negative to avoid NaN results.

Tests and Examples

cd tests

1. Benchmark

# requires tf2
python tests_comparisons_tf_skimage.py 

# or skimage only
# python tests_comparisons_skimage.py 

Outputs:

Downloading test image...
===================================
             Test SSIM
===================================
====> Single Image
Repeat 100 times
sigma=0.0 ssim_skimage=1.000000 (147.2605 ms), ssim_tf=1.000000 (343.4146 ms), ssim_torch=1.000000 (92.9151 ms)
sigma=10.0 ssim_skimage=0.932423 (147.5198 ms), ssim_tf=0.932661 (343.5191 ms), ssim_torch=0.932421 (95.6283 ms)
sigma=20.0 ssim_skimage=0.785744 (152.6441 ms), ssim_tf=0.785733 (343.4085 ms), ssim_torch=0.785738 (87.5639 ms)
sigma=30.0 ssim_skimage=0.636902 (145.5763 ms), ssim_tf=0.636902 (343.5312 ms), ssim_torch=0.636895 (90.4084 ms)
sigma=40.0 ssim_skimage=0.515798 (147.3798 ms), ssim_tf=0.515801 (344.8978 ms), ssim_torch=0.515791 (96.4440 ms)
sigma=50.0 ssim_skimage=0.422011 (148.2900 ms), ssim_tf=0.422007 (345.4076 ms), ssim_torch=0.422005 (86.3799 ms)
sigma=60.0 ssim_skimage=0.351139 (146.2039 ms), ssim_tf=0.351139 (343.4428 ms), ssim_torch=0.351133 (93.3445 ms)
sigma=70.0 ssim_skimage=0.296336 (145.5341 ms), ssim_tf=0.296337 (345.2255 ms), ssim_torch=0.296331 (92.6771 ms)
sigma=80.0 ssim_skimage=0.253328 (147.6655 ms), ssim_tf=0.253328 (343.1386 ms), ssim_torch=0.253324 (82.5985 ms)
sigma=90.0 ssim_skimage=0.219404 (142.6025 ms), ssim_tf=0.219405 (345.8275 ms), ssim_torch=0.219400 (100.9946 ms)
sigma=100.0 ssim_skimage=0.192681 (144.5597 ms), ssim_tf=0.192682 (346.5489 ms), ssim_torch=0.192678 (85.0229 ms)
Pass!
====> Batch
Pass!


===================================
             Test MS-SSIM
===================================
====> Single Image
Repeat 100 times
sigma=0.0 msssim_tf=1.000000 (671.5363 ms), msssim_torch=1.000000 (125.1403 ms)
sigma=10.0 msssim_tf=0.991137 (669.0296 ms), msssim_torch=0.991086 (113.4078 ms)
sigma=20.0 msssim_tf=0.967292 (670.5530 ms), msssim_torch=0.967281 (107.6428 ms)
sigma=30.0 msssim_tf=0.934875 (668.7717 ms), msssim_torch=0.934875 (111.3334 ms)
sigma=40.0 msssim_tf=0.897660 (669.0801 ms), msssim_torch=0.897658 (107.3700 ms)
sigma=50.0 msssim_tf=0.858956 (671.4629 ms), msssim_torch=0.858954 (100.9959 ms)
sigma=60.0 msssim_tf=0.820477 (670.5424 ms), msssim_torch=0.820475 (103.4489 ms)
sigma=70.0 msssim_tf=0.783511 (671.9357 ms), msssim_torch=0.783507 (113.9048 ms)
sigma=80.0 msssim_tf=0.749522 (672.3925 ms), msssim_torch=0.749518 (120.3891 ms)
sigma=90.0 msssim_tf=0.716221 (672.9066 ms), msssim_torch=0.716217 (118.3788 ms)
sigma=100.0 msssim_tf=0.684958 (675.2075 ms), msssim_torch=0.684953 (117.9481 ms)
Pass
====> Batch
Pass
ssim=1.0000 ssim=0.4225 ssim=0.1924

2. MS_SSIM as loss function

See 'tests/tests_loss.py' for more details about how to use ssim or ms_ssim as loss functions

3. AutoEncoder

See 'tests/ae_example'

Results (left: the original image, right: the reconstructed image)

References

https://github.com/jorge-pessoa/pytorch-msssim
https://ece.uwaterloo.ca/~z70wang/research/ssim/
https://ece.uwaterloo.ca/~z70wang/publications/msssim.pdf
Matlab Code
ssim & ms-ssim from tensorflow

pytorch-msssim's People

Contributors

etienne66, fynnbe, horseee, iyume, lartpang, one-sixth, trougnouf, vainf, wxs


pytorch-msssim's Issues

Autoencoder only outputs greyscale

When I use SSIM with my autoencoder, it produces greyscale images (or very close colors), yet mean squared error gives me color. Is there a reason for this? I have tried taking SSIM per channel and I get the same issue.

I am using the loss as:

ssim_loss = SSIM(data_range=1.0, size_average=True, channel=3)

Any suggestions?

When I try to process a grayscale image, it reports this problem

When I input a 2D grayscale image, it reports this:
ValueError: Input images should be 4-d or 5-d tensors, but got torch.Size([256, 256])

Therefore, I changed the input to 4D (1,1,256,256), but then I got this error:
RuntimeError: Given groups=1, weight of size [3, 1, 1, 11], expected input[1, 3, 246, 256] to have 1 channels, but got 3 channels instead
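The weight shape `[3, 1, 1, 11]` in the second error suggests a kernel built for three channels, which would happen if the module was created without `channel=1` (the Basic Usage comment says `channel=1 for grayscale images`). A minimal sketch of the reshaping step follows; `as_nchw` is a hypothetical helper, not a function of the library.

```python
import torch

def as_nchw(image: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: promote (H, W) or (C, H, W) to (N, C, H, W)."""
    if image.dim() == 2:      # (H, W) -> (1, 1, H, W)
        return image[None, None]
    if image.dim() == 3:      # (C, H, W) -> (1, C, H, W)
        return image[None]
    return image

gray = torch.rand(256, 256)
batch = as_nchw(gray)
print(batch.shape)

# With pytorch-msssim installed, the module would then be built for one channel,
# following the README's Basic Usage section:
#   ssim_module = SSIM(data_range=1.0, size_average=True, channel=1)
#   score = ssim_module(batch, batch)
```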

Extend the package with newer variants of MS-SSIM

What are your thoughts about extending the library to include the following metrics:

  • 3-SSIM
  • 3-MS-SSIM
  • G-SSIM
  • G-MS-SSIM
  • 3-G-SSIM
  • 3-G-MS-SSIM
  • 4-SSIM
  • 4-MS-SSIM
  • 4-G-SSIM
  • 4-G-MS-SSIM

From a quick look, the required changes are:

  • Decoupling the scalar SSIM calculation from the calculation of the SSIM map (here).
  • Allowing for padding during the SSIM map calculation (here).

An implementation choice that I did not understand is why do you apply multiple 1D gaussian smoothings (here) instead of a 2D or 3D one. Could you please explain it?

What would you think about a collaboration to extend the codebase and improve it?

How to calculate ssim between channels in a tensor ?

The output of our CNN is a non-negative tensor named D whose shape is [B,4,H,W], where B is the batch size. For every sample, the output is a [4,H,W] tensor named Di. We want to minimize the structural similarity between the channels of Di, so we define a custom loss function using SSIM. We calculate the SSIM value of each channel against the others and take the sum as the final loss.

In the beginning, we did not consider the differences in value distribution between channels, and the code was:

criterionSSIM = ssim.SSIM(data_range=1, channel=4)  # construct the SSIM criterion
T1 = D.clone().detach()
l1 = T1[:, 0, :, :]
l2 = T1[:, 1, :, :]
l3 = T1[:, 2, :, :]
l4 = T1[:, 3, :, :]
tmp1 = torch.stack([l2, l3, l4, l1], 1)
loss1 = criterionSSIM(fusion_out, tmp1)
tmp2 = torch.stack([l3, l4, l1, l2], 1)
loss2 = criterionSSIM(fusion_out, tmp2)
tmp3 = torch.stack([l4, l1, l2, l3], 1)
loss3 = criterionSSIM(fusion_out, tmp3)
lossSSIM = (loss1+loss2+loss3)

But we found that the SSIM loss quickly drops below zero. To avoid negative SSIM, we normalized every channel of Di to [0, 1], and the code became:

criterionSSIM = ssim.SSIM(data_range=1, channel=4)  # construct the SSIM criterion
B, C, H, W = D.shape
for b in range(B):
    for c in range(C):
        D[b][c] = D[b][c] / torch.max(D[b][c])  # normalize every channel to [0, 1] (in place)
T1 = D.clone().detach()
l1 = T1[:, 0, :, :]
l2 = T1[:, 1, :, :]
l3 = T1[:, 2, :, :]
l4 = T1[:, 3, :, :]
tmp1 = torch.stack([l2, l3, l4, l1], 1)
loss1 = criterionSSIM(fusion_out, tmp1)
tmp2 = torch.stack([l3, l4, l1, l2], 1)
loss2 = criterionSSIM(fusion_out, tmp2)
tmp3 = torch.stack([l4, l1, l2, l3], 1)
loss3 = criterionSSIM(fusion_out, tmp3)
lossSSIM = (loss1+loss2+loss3)

Then PyTorch raises:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [224, 224]], which is output 0 of SelectBackward, is at version 128; expected version 127 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

We think this error is caused by the normalization step:

for b in range(B):
    for c in range(C):
        D[b][c] = D[b][c] / torch.max(D[b][c])  # normalize every channel to [0, 1]

But as rookies, we don't know how to fix it. I checked out #6934 but got no clue. If anybody here can help us, it would be much appreciated.
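The error message points at an in-place write into a tensor that autograd still needs. One common way around it is to compute the normalized tensor out of place and leave `D` untouched. The sketch below assumes `D` is non-negative with shape (B, C, H, W), as described in the issue; `normalize_per_channel` and the `eps` guard are illustrative additions.

```python
import torch

def normalize_per_channel(d: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Scale every (H, W) slice of a non-negative (B, C, H, W) tensor to [0, 1]."""
    channel_max = d.amax(dim=(2, 3), keepdim=True)   # shape (B, C, 1, 1)
    return d / (channel_max + eps)                   # returns a new tensor; d is not modified

# Stand-in for the network output described in the issue.
D = (torch.rand(2, 4, 16, 16) * 5).requires_grad_()
D_norm = normalize_per_channel(D)
D_norm.sum().backward()   # backward runs without the in-place error
print(D_norm.amax(dim=(2, 3)))
```

Because no element of `D` is overwritten, the version counter that autograd checks stays unchanged, and the double Python loop is replaced by one vectorized reduction.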

size_average=False returns a tensor containing individual SSIM scores instead of returning their sum

Hi, I'm not sure if this behavior is intended or not, but for a batch of 64 images, when I set size_average=False, I get a tensor of size 64 with individual values:

tensor([1.0004, 0.9843, 0.9976, 0.9938, 0.9879, 0.9989, 0.9916, 0.9976, 1.0069,
        0.9847, 0.9832, 0.9844, 0.9757, 0.9914, 0.9717, 1.0027, 1.0106, 0.9889,
        0.9885, 0.9949, 0.9996, 0.9965, 0.9801, 0.9887, 0.9879, 0.9804, 0.9926,
        0.9856, 0.9896, 0.9936, 0.9950, 0.9941, 0.9911, 0.9860, 0.9886, 0.9949,
        0.9881, 0.9898, 0.9934, 0.9825, 0.9939, 0.9912, 0.9955, 0.9937, 1.0005,
        0.9975, 0.9831, 1.0005, 0.9970, 0.9953, 0.9855, 1.0001, 1.0101, 0.9862,
        0.9960]

The usual approach in PyTorch is:

size_average (bool, optional) – Deprecated (see reduction). By default, the losses are averaged over each loss element in the batch. Note that for some losses, there are multiple elements per sample. If the field size_average is set to False, the losses are instead summed for each minibatch. Ignored when reduce is False. Default: True

Source: https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html

test ssim_torch and skimage

Hi, I just tested pytorch_msssim.ssim against skimage. When I compute the SSIM value with these two methods on a 2D matrix, I get different results, and I want to know what the problem is. Maybe I am using ssim in the wrong way? Here are my code and results.

code:
import torch
from skimage.metrics import structural_similarity
from pytorch_msssim import ssim

m0 = torch.ones(7, 7, dtype=torch.float) # OR matrix
m0[4:7, 0:3] = 0

m_sig = torch.ones(7, 7, dtype=torch.float)
m_sig[4:7, 0] = 0

ssim_out = ssim(m0.reshape(1, 1, 7, 7), m_sig.reshape(1, 1, 7, 7))
print('pytorch_ssim_value = ' + str(ssim_out.numpy()))

ssim_value2 = structural_similarity(m_sig.numpy(), m0.numpy())
print('ssim_value = ' + str(ssim_value2))

results:
pytorch_ssim_value = 0.9836789
ssim_value = 0.4858374093651142
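One factor that may contribute to the gap (an assumption, not verified against either library) is `data_range`: the README examples always pass it explicitly, while the call in this issue relies on a default even though the data lie in [0, 1]. The sketch below is a simplified single-window SSIM with uniform weights, not the Gaussian-windowed version either library computes; it only illustrates how strongly the score depends on `data_range`.

```python
def global_ssim(x, y, data_range, k1=0.01, k2=0.03):
    """SSIM over one window covering the whole image (uniform weights)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

# The two 7x7 matrices from the issue, flattened row by row.
m0 = [0.0 if (r >= 4 and c < 3) else 1.0 for r in range(7) for c in range(7)]
m_sig = [0.0 if (r >= 4 and c == 0) else 1.0 for r in range(7) for c in range(7)]

ssim_range_255 = global_ssim(m0, m_sig, data_range=255)
ssim_range_1 = global_ssim(m0, m_sig, data_range=1)
print(ssim_range_255, ssim_range_1)
```

With a `data_range` far larger than the actual signal, the stabilizing constants dominate both numerator and denominator and the score is pushed towards 1. The 7x7 input is also smaller than an 11-tap window, so window-size handling may differ between the two libraries as well.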

C++ version

Hi,

Do you have any plans to support the PyTorch C++ API? I would be interested in contributing.

Thanks

gaussian_filter can reduce some transpose operations.

import torch.nn.functional as F

def gaussian_filter(input, win):
    r""" Blur input with a 1-D kernel
    Args:
        input (torch.Tensor): a batch of tensors to be blurred, shape (N, C, H, W)
        win (torch.Tensor): 1-D Gaussian kernel, shape (C, 1, 1, size)
    Returns:
        torch.Tensor: blurred tensors
    """

    N, C, H, W = input.shape
    # blur along the width with the kernel as given
    out = F.conv2d(input, win, stride=1, padding=0, groups=C)
    # blur along the height by transposing the kernel instead of the input
    out = F.conv2d(out, win.transpose(2, 3), stride=1, padding=0, groups=C)
    return out

Transposing only the convolution kernel saves some operations, so it may be faster.

How to get the Difference image?

So in the original SSIM implementation in skimage.metrics, we can get the diff image which is the actual image differences between the two images. Do we have support to get the diff image in MSSSIM as well?

(feature request) SSIM for time series data/variable size data?

I have data with shape (batch, 80, sample len); for now, I'm using SSIM after reshaping to (batch, 1, 80, sample len).
But SSIM does not take zero padding into account.

Example:
[images: predicted and target samples]

Given heavy zero padding, the SSIM score will be better for shorter training samples.

Unsupported for grayscale image with 1 channel or 2D input?

It seems that neither the classes SSIM/MS_SSIM nor the functions ssim/ms_ssim support input with 1 channel. When I fed the ssim function an input array of size (1,1,1984,1984), I encountered the following error:

'dim' is an invalid keyword argument for squeeze()

What's more, only 4D numpy arrays are supported, not 2D images, so the dimensions must be expanded before processing, which is not very convenient. Please consider adding some parameters/functions for more versatile usage. Thanks for your efforts!

Checkerboard artifacts

Hi,

I'm getting checkerboard artifacts using MS-SSIM as a reconstruction loss but not while using simple SSIM.

Any pointers on how I can reduce those?

[image: sample output showing the checkerboard artifacts]

SSIM result is different from skimage.measure.compare_ssim

Hi,
Thanks for this tool. I use both pytorch_msssim.ssim and skimage.measure.compare_ssim to compute SSIM, but the results are different. For example, SSIM evaluation on an image sequence:
pytorch_msssim.ssim: [0.9655, 0.9500, 0.9324, 0.9229, 0.9191, 0.9154]
skimage.measure.compare_ssim: [0.97794482, 0.96226299, 0.948432, 0.9386946, 0.93113704, 0.92531453]

Why does this happen?

Is it OK to have a loss bigger than one?

I'm using SSIM loss for my autoencoder.
It seems like the results come out really nice,

but I see that the loss is bigger than one,
whereas from the equation, I don't see how it can be bigger than one.

Any help or intuition on this?

How to train??

Hi, thanks for sharing. I don't know how to use it for training. My code is below; the loss (ssim_out) is decreasing, but img2 does not come out correctly.

#import pytorch_ssim
from pytorch_msssim import ssim, ms_ssim, SSIM, MS_SSIM
import torch
from torch.autograd import Variable
from torch import optim
import cv2
import numpy as np

npImg1 = cv2.imread("einstein.png")[:,:,0]
npImg1 = npImg1[:,:, np.newaxis]

img1 = torch.from_numpy(np.rollaxis(npImg1, 2)).float().unsqueeze(0)/255.0
img2 = torch.rand(img1.size())

if torch.cuda.is_available():
    img1 = img1.cuda()
    img2 = img2.cuda()


img1 = Variable( img1,  requires_grad=False)
img2 = Variable( img2, requires_grad = True)

# Functional: pytorch_ssim.ssim(img1, img2, window_size = 11, size_average = True)
#ssim_value = pytorch_ssim.ssim(img1, img2).item() 
ssim_value = ssim(img1, img2).item()
print("Initial ssim:", ssim_value)

# Module: pytorch_ssim.SSIM(window_size = 11, size_average = True)
#ssim_loss = pytorch_ssim.SSIM()
ssim_loss = SSIM(win_size=11, win_sigma=1.5, data_range=255, size_average=True, channel=1)

optimizer = optim.Adam([img2], lr=0.01)

while ssim_value < 1.0:
    optimizer.zero_grad()
    ssim_out = ssim_loss(img1, img2)
    #ssim_out = torch.nn.functional.mse_loss(img1, img2) 
    print(ssim_out.item())
    ssim_out.backward()
    optimizer.step()

    img2_ = (img2 * 255.0).squeeze()
    #np_img2 = img2_.data.numpy().transpose(1,2,0).astype(np.uint8)
    np_img2 = img2_.data.numpy().astype(np.uint8)
    cv2.imwrite("result.jpg", np_img2)
    cv2.imshow("result",np_img2)
    cv2.waitKey(5)
cv2.waitKey(0)
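Comparing this script with the pytorch_ssim version that follows suggests a few possible causes (observations from the code as posted, not confirmed by the maintainers): the loop minimizes `ssim_loss(img1, img2)` itself, so the optimizer pushes similarity down, whereas the other script minimizes its negative; `ssim_value` is never updated inside the loop; and `data_range=255` is used although both images lie in [0, 1]. A minimal sketch of the intended pattern follows. `fit_image` and `toy_similarity` are hypothetical names, and `toy_similarity` is only a stand-in so the sketch runs without pytorch-msssim.

```python
import torch

def fit_image(target, similarity_fn, steps=200, lr=0.05):
    """Optimize a random image so that similarity_fn(target, image) goes up."""
    image = torch.rand_like(target).requires_grad_()
    optimizer = torch.optim.Adam([image], lr=lr)
    history = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = 1 - similarity_fn(target, image)   # minimize 1 - similarity
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            image.clamp_(0, 1)                    # keep pixels inside the data range
        history.append(loss.item())
    return image.detach(), history

def toy_similarity(a, b):
    """Stand-in for an SSIM module (assumption): 1 - MSE, equal to 1 for identical inputs."""
    return 1 - torch.mean((a - b) ** 2)

torch.manual_seed(0)
target = torch.rand(1, 1, 32, 32)
result, history = fit_image(target, toy_similarity)
print(history[0], history[-1])
```

With the library installed, `similarity_fn` could be a module built as in the README, for example `SSIM(data_range=1.0, size_average=True, channel=1)` for a single-channel image scaled to [0, 1].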

But when I use another SSIM implementation (pytorch_ssim), any img1 picture can be trained and shown correctly.

The training code is here:

import pytorch_ssim
import torch
from torch.autograd import Variable
from torch import optim
import cv2
import numpy as np

npImg1 = cv2.imread("einstein.png")[:,:,0]
npImg1 = npImg1[:,:, np.newaxis]

img1 = torch.from_numpy(np.rollaxis(npImg1, 2)).float().unsqueeze(0)/255.0
img2 = torch.rand(img1.size())

if torch.cuda.is_available():
    img1 = img1.cuda()
    img2 = img2.cuda()


img1 = Variable( img1,  requires_grad=False)
img2 = Variable( img2, requires_grad = True)

# Functional: pytorch_ssim.ssim(img1, img2, window_size = 11, size_average = True)
ssim_value = pytorch_ssim.ssim(img1, img2).item()
print("Initial ssim:", ssim_value)

# Module: pytorch_ssim.SSIM(window_size = 11, size_average = True)
ssim_loss = pytorch_ssim.SSIM()

optimizer = optim.Adam([img2], lr=0.01)

while ssim_value < 0.97:
    optimizer.zero_grad()
    ssim_out = -ssim_loss(img1, img2)
    #ssim_out = -torch.nn.functional.mse_loss(img1, img2)
    ssim_value = - ssim_out.item()
    print(ssim_value)
    ssim_out.backward()
    optimizer.step()

    img2_ = (img2 * 255.0).squeeze()
    #np_img2 = img2_.data.numpy().transpose(1,2,0).astype(np.uint8)
    np_img2 = img2_.data.numpy().astype(np.uint8)
    cv2.imwrite("result.jpg", np_img2)
    cv2.imshow("result",np_img2)
    cv2.waitKey(5)
cv2.waitKey(0)

Apply to 5D tensor?

I am working on medical images. They are 5D tensors, such as BxNxDxHxW. Could you please write a test function for 5D tensors? I hear that we can use the view function to convert 5D to 4D and then use your code, but I am not sure about the speed or how to do it. Thanks

a = torch.randint(0, 255, size=(20, 3, 256, 256), dtype=torch.float32).cuda() / 255.

In addition, there is a bug when the number of channels is not 3. Try this script:


import os
import sys
sys.path.append(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
from pytorch_msssim import *
import torch

s = SSIM(data_range=1.)

a = torch.randint(0, 255, size=(10, 2, 64, 64, 64), dtype=torch.float32).cuda() / 255.
b = a * 0.5
B,c,d,h,w = a.size()
a = a.view(B,c,d,-1)
b = b.view(B,c,d,-1)
a.requires_grad = True
b.requires_grad = True
print(a.size(), b.size())

start_record = torch.cuda.Event(enable_timing=True)
end_record = torch.cuda.Event(enable_timing=True)

start_record.record()
for _ in range(500):
    loss = s(a, b)
    loss.backward()
end_record.record()
torch.cuda.synchronize()

print('cuda time: ', start_record.elapsed_time(end_record)/1000)

The error is:

File "/home/john/pytorch-msssim/pytorch_msssim/ssim.py", line 63, in _ssim
    mu1 = gaussian_filter(X, win)
  File "/home/john/pytorch-msssim/pytorch_msssim/ssim.py", line 34, in gaussian_filter
    out = F.conv2d(input, win, stride=1, padding=0, groups=1)
RuntimeError: Given groups=1, weight of size [3, 1, 1, 11], expected input[2, 2, 64, 4096] to have 1 channels, but got 2 channels instead
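Two things stand out in the script: `SSIM(data_range=1.)` is built without `channel=2`, which matches the three-channel weight in the error, and `a.view(B, c, d, -1)` merges height and width into one axis, which changes pixel neighborhoods. The changelog above lists 3D image support since v0.2.1; if slice-wise 2D SSIM is what is wanted instead, one option is to fold the depth axis into the batch. The helper below is a hypothetical sketch of that reshaping only.

```python
import torch

def fold_depth_into_batch(volume: torch.Tensor) -> torch.Tensor:
    """(B, C, D, H, W) -> (B * D, C, H, W): every depth slice becomes one 2-D image."""
    b, c, d, h, w = volume.shape
    # Move depth next to batch before merging, so slices keep their (C, H, W) layout.
    return volume.permute(0, 2, 1, 3, 4).reshape(b * d, c, h, w)

a = torch.rand(10, 2, 8, 64, 64)
a_2d = fold_depth_into_batch(a)
print(a_2d.shape)

# The folded tensors could then go to a 2-D module built for two channels, e.g.
#   s = SSIM(data_range=1., channel=2)
```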

negative number error

Sometimes cs calculated by _ssim is negative, which makes the result NaN.

Question about calculating MSSSIM

I found your code quite useful in my project. However, lines 230 to 234 in ssim.py make the list mcs contain only levels-1 (M-1) cs values, while the original equation requires M values.

MS_SSIM

Hi,

I used MS_SSIM with default win_size=11 and default weights = [0.0448, 0.2856, 0.3001, 0.2363, 0.1333]

my input image size is 64x64

So, I get this error :

RuntimeError: Calculated padded input size per channel: (8 x 8). Kernel size: (1 x 11). Kernel size can't be greater than actual input size

Conversion of Image's Colorspace?

Hi, I came across this while looking for a PyTorch implementation of SSIM. On the Tensorflow page for SSIM it's mentioned that "Note: The true SSIM is only defined on grayscale. This function does not perform any colorspace transform. (If the input is already YUV, then it will compute YUV SSIM average.)"

I was wondering if there are any conversions (either to YUV or to grayscale) that I would have to do to use this. My images are all RGB images.
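The README examples pass RGB batches directly with `channel=3` and get one score per image. If a grayscale comparison is preferred, one option is to convert to luma first. The sketch below uses the ITU-R BT.601 weights; `rgb_to_gray` is an illustrative helper, not part of the library, and other weightings (for example BT.709) are equally valid choices.

```python
import torch

# ITU-R BT.601 luma weights for R, G, B.
LUMA_WEIGHTS = torch.tensor([0.299, 0.587, 0.114])

def rgb_to_gray(rgb: torch.Tensor) -> torch.Tensor:
    """(N, 3, H, W) RGB -> (N, 1, H, W) luma, keeping the input value range."""
    weights = LUMA_WEIGHTS.to(rgb.device, dtype=rgb.dtype).view(1, 3, 1, 1)
    return (rgb * weights).sum(dim=1, keepdim=True)

rgb = torch.rand(4, 3, 32, 32)
gray = rgb_to_gray(rgb)
print(gray.shape)

# The single-channel result would be scored with channel=1, e.g.
#   SSIM(data_range=1.0, size_average=True, channel=1)
```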

Faster and lower VRAM usage in _ssim function

In my tests (using the test code from #3), separate operations turned out to be more efficient than combined ones, which looks a bit counterintuitive.

Replacing

    mu1, mu2, sigma1_sq, sigma2_sq, sigma12 = (
        concat_out[:, idx*channel:(idx+1)*channel, :, :] for idx in range(5))

with

    mu1, mu2, sigma1_sq, sigma2_sq, sigma12 = torch.chunk(concat_out, 5, 1)

reduces the running time from 51s to 37s.

Replacing

    concat_input = torch.cat([X, Y, X*X, Y*Y, X*Y], dim=1)
    concat_win = win.repeat(5, 1, 1, 1).to(X.device, dtype=X.dtype)
    concat_out = gaussian_filter(concat_input, concat_win)

    mu1, mu2, sigma1_sq, sigma2_sq, sigma12 = (
        concat_out[:, idx*channel:(idx+1)*channel, :, :] for idx in range(5))

with

    win = win.to(X.device, dtype=X.dtype)
    mu1 = gaussian_filter(X, win)
    mu2 = gaussian_filter(Y, win)
    sigma1_sq = gaussian_filter(X * X, win)
    sigma2_sq = gaussian_filter(Y * Y, win)
    sigma12 = gaussian_filter(X * Y, win)

reduces the running time from 51s to 36s and reduces VRAM usage from 1G to 733MB.
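The first replacement relies on `torch.chunk` returning the same five channel groups as the manual slicing. That part can be checked in isolation, independent of the timings reported above; the tensor shape here is arbitrary.

```python
import torch

channel = 3
concat_out = torch.rand(2, 5 * channel, 16, 16)

# Manual slicing, as in the original expression.
sliced = [concat_out[:, idx * channel:(idx + 1) * channel, :, :] for idx in range(5)]
# torch.chunk splits dim 1 into five equal parts.
chunked = torch.chunk(concat_out, 5, dim=1)

same = all(torch.equal(s, c) for s, c in zip(sliced, chunked))
print(same)
```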

How to fix data_range in ssim ?

Hi @VainF

I am trying to train my cnn model with ssim loss.

So, I used 2 methods for training:

Method 1:

output_normalized = (output-min_val)/(max_val-min_val)
target_normalized = (target-min_val)/(max_val-min_val)
loss = 100*(1 - ssim(output_normalized, target_normalized,  data_range=1, size_average=True))

Method 2:
loss = 100*(1 - ssim(output-min_val, target-min_val, data_range=max_val-min_val, size_average=True))

-Which method is better for training with ssim ?

-What is better:
to compute data_range for each output of the model
or
to use fixed data_range for all outputs ?

Thanks
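Under the standard SSIM definition (an assumption here; the library's exact constants are not shown in this issue), the two methods should give the same value when min_val and max_val are the same, because the stabilizing constants scale with the square of data_range. A short derivation, writing $x$ and $y$ for the shifted inputs, $L$ for max_val - min_val, and $x' = x/L$, $y' = y/L$:

```latex
\mathrm{SSIM}(x, y; L) =
  \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)},
  \qquad C_i = (K_i L)^2

\mu_{x'} = \frac{\mu_x}{L}, \quad
\sigma_{x'}^2 = \frac{\sigma_x^2}{L^2}, \quad
\sigma_{x'y'} = \frac{\sigma_{xy}}{L^2}, \quad
C_i' = K_i^2 = \frac{C_i}{L^2}

\mathrm{SSIM}(x', y'; 1) =
  \frac{L^{-2}(2\mu_x\mu_y + C_1)\; L^{-2}(2\sigma_{xy} + C_2)}
       {L^{-2}(\mu_x^2 + \mu_y^2 + C_1)\; L^{-2}(\sigma_x^2 + \sigma_y^2 + C_2)}
  = \mathrm{SSIM}(x, y; L)
```

Every factor picks up the same $L^{-2}$, so the ratio is unchanged. The second question is a different matter: if data_range is recomputed for each output, the constants change from sample to sample, so scores are no longer computed on a common scale.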

Implementing SSIM for 3D autoencoders

Hello
I wanted to implement this SSIM loss for a 3D autoencoder because of the success I have had with it in my 2D autoencoder. However, I noticed the code is only implemented for 2D images, as I get the error: expected stride to be a single integer value or a list of 2 values to match the convolution dimensions, but got stride=[1, 1, 1].

Is there a method where I can implement this loss function for 3D data as well as 2D?

[Feature Request] MaskedSSIM to calculate SSIM on images with valid pixel mask

Great repository! I am working on a deep learning application where about 70% of the pixels in my ground truth target images are masked, because they contain invalid pixels. Is there a way to extend this repository to calculate the SSIM only over valid pixels?

As a first step, it would be really helpful to add the torch argument: reduction = 'none' and have _ssim() return the ssim_map instead of the average across all pixels in the image. It's not perfect, but from there one could calculate an approximate MaskedSSIM by averaging SSIM only over valid pixels.

torch reduction argument for reference:
reduction (str, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the sum of the output will be divided by the number of elements in the output, 'sum': the output will be summed. Note: size_average and reduce are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction. Default: 'mean'
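The approximate MaskedSSIM described above could look roughly like the sketch below. It assumes an SSIM map produced by an unpadded ("valid") convolution with an 11-tap window, so the map is `win_size - 1` pixels smaller than the image per side pair; the map itself is a placeholder here, since the issue is precisely that the library does not return it. `masked_mean` is a hypothetical helper.

```python
import torch
import torch.nn.functional as F

def masked_mean(ssim_map: torch.Tensor, valid: torch.Tensor, win_size: int = 11) -> torch.Tensor:
    """Average an SSIM map over windows that contain only valid pixels.

    ssim_map: (N, C, H - win_size + 1, W - win_size + 1)
    valid:    (N, 1, H, W), 1.0 for valid pixels and 0.0 for masked ones
    """
    # Min-pooling (negated max-pooling) marks a window as usable only if all its pixels are valid.
    window_ok = -F.max_pool2d(-valid, kernel_size=win_size, stride=1)
    weights = window_ok.expand_as(ssim_map)
    total = weights.sum(dim=(1, 2, 3)).clamp(min=1)   # avoid division by zero
    return (ssim_map * weights).sum(dim=(1, 2, 3)) / total

ssim_map = torch.full((1, 1, 22, 22), 0.5)   # placeholder map for a 32x32 image
ssim_map[..., :, 11:] = 1.0                  # right part of the map scores higher
valid = torch.ones(1, 1, 32, 32)
valid[..., :, :16] = 0.0                     # left 16 image columns are invalid
score = masked_mean(ssim_map, valid)
print(score)
```

Eroding the mask this way discards every window that overlaps an invalid pixel, which is stricter than averaging over valid pixel positions; a softer weighting would be another reasonable choice.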

Implementation of MSSSIM + L1 for Greyscale Images

Hi, thanks for the great implementation. I am currently working with greyscale images and implemented the loss for 1 channel images here. Would you mind looking at the implementation and see if it is accurate?

Thank you.

The value of SSIM is calculated improperly

I have 3 images: the reference, img_1, and img_2.

I use this code to calculate SSIM value:

import matplotlib.pyplot as plt
import torch
from pytorch_msssim import ssim
ssim_value = ssim(reference.unsqueeze(0), img_1.unsqueeze(0), data_range=255)
output_img = torch.cat((reference, img_1), dim=2)
plt.figure(figsize=(8, 4))
plt.imshow(output_img.numpy().transpose(1, 2, 0))
plt.axis('off')
plt.title(f'ssim = {ssim_value}')
plt.show()
ssim_value = ssim(reference.unsqueeze(0), img_2.unsqueeze(0), data_range=255)
output_img = torch.cat((reference, img_2), dim=2)
plt.figure(figsize=(8, 4))
plt.imshow(output_img.numpy().transpose(1, 2, 0))
plt.axis('off')
plt.title(f'ssim = {ssim_value}')
plt.show()

But the results are unexpected:
[images: the two plotted image pairs with their SSIM titles]

NaN with ms_ssim

ms_ssim function outputs NaN when the input images are anti-correlated, i.e. when ssim would output values between -1 and 0.

Example:

X = torch.rand(3,1,190,190)
Y = 1 - X

print(ssim( X, Y, data_range=1, size_average=False))
print(ms_ssim( X, Y, data_range=1, size_average=False))

tensor([-0.9664, -0.9654, -0.9649])
tensor([nan, nan, nan])
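A likely mechanism (consistent with the README note that ms-ssim forces its responses to be non-negative) is that MS-SSIM raises per-scale terms to fractional weights, and a negative base with a fractional exponent has no real result. The sketch below shows that effect on a few hand-picked values; it does not call the library.

```python
import torch

cs = torch.tensor([-0.5, 0.25, 0.9])   # per-scale terms; the first one is negative
weight = 0.2856                        # one of the default MS-SSIM weights quoted in this document

raw = cs ** weight                     # negative base, fractional exponent -> NaN
clamped = torch.relu(cs) ** weight     # clamp to zero before exponentiation
print(raw, clamped)
```

Clamping trades the NaN for a zero factor, which drives the whole MS-SSIM product to zero for that image.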

ms_ssim calculation

First of all, thank you for sharing it with us!
While reading your code, I noticed in ssim.py lines 195-6:

msssim_val = torch.prod((mcs[:-1] ** weights[:-1].unsqueeze(1)) * (ssim_val ** weights[-1]), dim=0) # (batch, )

By definition:
[screenshot: MS-SSIM definition]

As far as I understand, the SSIM result is the product of l_m(p) and cs(p), so it seems that your code calculates:
[screenshot: expression computed by the code]

I think that the correct code would be:
msssim_val = torch.prod((mcs ** weights.unsqueeze(1)), dim=0) * (ssim_val / cs) # (batch, )
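One way to read the line quoted from ssim.py is that the factor `ssim_val ** weights[-1]` sits inside the product over scales, so it is multiplied in once per entry of `mcs[:-1]` instead of once overall. The sketch below illustrates that reading with made-up scalar values; it is not the library's code and does not evaluate the fix proposed in the issue.

```python
import math

weights = [0.0448, 0.2856, 0.3001, 0.2363, 0.1333]   # default weights quoted elsewhere in this document
mcs = [0.95, 0.90, 0.85, 0.80]   # illustrative cs values for the four finer scales
ssim_val = 0.70                  # illustrative SSIM at the coarsest scale

# SSIM factor inside the product: multiplied in once per cs entry (four times).
inside = math.prod((c ** w) * (ssim_val ** weights[-1])
                   for c, w in zip(mcs, weights[:-1]))

# SSIM factor applied once, outside the product.
outside = math.prod(c ** w for c, w in zip(mcs, weights[:-1])) * ssim_val ** weights[-1]

print(inside, outside)
```

The two results differ by `ssim_val ** (3 * weights[-1])`, so for any `ssim_val` below 1 the "inside" form gives a smaller score.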

Best way to deal with small images with msssim

I have a network that trains 64x64px images. I can't currently use MSSSIM as a loss function as the number of downsamples means I need a larger input image size as suggested by the error message,

Image size should be larger than 160 due to the 4 downsamplings in ms-ssim

What is the best way to deal with this? I have to keep the 64x64px input image size for various reasons. My immediate thoughts are to pad the input images with zero (or 0.5???) up to 160x160 prior to calculating the loss?

Is this a legitimate way to go?

Steve
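The error message implies a size check of the form `(win_size - 1) * 2**4 < smaller image side` (160 for an 11-tap window and 4 downsamplings). Assuming that form, an alternative to padding is a smaller window or fewer scales. The helpers below are illustrative arithmetic only, not library functions, and whether a given window size is accepted should be checked against the installed version.

```python
def min_side_exclusive(win_size: int = 11, levels: int = 5) -> int:
    """The smaller image side must be strictly greater than this value."""
    return (win_size - 1) * 2 ** (levels - 1)

def largest_odd_win_size(side: int, levels: int = 5) -> int:
    """Largest odd window size for which an image of this side still passes the check."""
    # Largest win with (win - 1) * 2**(levels - 1) < side.
    win = (side - 1) // 2 ** (levels - 1) + 1
    return win if win % 2 == 1 else win - 1

print(min_side_exclusive())        # 160
print(largest_odd_win_size(64))    # 3
```

For 64x64 inputs and five scales this leaves only a 3-tap window, which is very local; using fewer scales (a shorter weights list) keeps a larger window, for example `largest_odd_win_size(64, levels=3)` gives 15.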
