
Comments (12)

fmassa commented on May 27, 2024

I'd say as long as the returned tensor is properly converted to float and scaled to [0,1], things should be fine.
But we need to check if standard image transforms (like rotating, cropping, etc) work ok in PIL for int16 type.

Also, LongTensor is actually int64, you might be looking for a ShortTensor instead (which is signed).

from vision.

bodokaiser commented on May 27, 2024

That's the problem. I put together a minimal example where you can examine it.

from torchvision.transforms import Compose, ToPILImage, ToTensor
from matplotlib import pyplot as plt

import skimage.io
import numpy as np

img = skimage.io.imread('mr.tif')
print('img', img.shape, img.dtype)

plt.imshow(img)
plt.show()

transform = Compose([
    ToPILImage(),
    ToTensor(),
])

timg = transform(np.expand_dims(img, 2))

plt.imshow(timg[0].numpy())
plt.show()

[Attached images: img1, img2]

Here you find the corresponding TIFF file: mr.tif.zip


fmassa commented on May 27, 2024

The example that you mentioned shows that the current code is not adapted to int16 images, or did you try adding the modifications you mentioned?


bodokaiser commented on May 27, 2024

updated:

# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
  nchannel = 3
else:
  nchannel = len(pic.mode)
# handle PIL Image
buf = pic.tobytes()
if len(buf) > pic.width * pic.height * nchannel:
  img = torch.ShortTensor(torch.ShortStorage.from_buffer(buf, 'native'))
else:
  img = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))
img = img.view(pic.size[1], pic.size[0], nchannel)

This fails with:

RuntimeError: size '[466 x 394 x 1]' is invalid for input of with 367208 elements at /private/var/folders/y0/d4npmpd50971gpgqxtsvc25m0000gn/T/pip-_fraocf5-build/torch/lib/TH/THStorage.c:59

However, changing it to:

if len(buf) > pic.width * pic.height * nchannel:
  img = torch.ShortTensor(np.fromstring(buf, dtype=np.int16)[0::2])
else:
  img = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))
img = img.contiguous().view(pic.size[1], pic.size[0], nchannel)

does the job. Furthermore, we need to change ToPILImage:

class ToPILImage(object):
    """Converts a torch.*Tensor of range [0, 1] and shape C x H x W
    or numpy ndarray of dtype=uint8, range[0, 255] and shape H x W x C
    to a PIL.Image of range [0, 255]
    """

    def __call__(self, pic):
        npimg = pic
        mode = None
        if not isinstance(npimg, np.ndarray):
            npimg = pic.mul(255).byte().numpy()
            npimg = np.transpose(npimg, (1, 2, 0))

        if npimg.shape[2] == 1:
            npimg = npimg[:, :, 0]
            if npimg.dtype != np.int16:
                mode = "L"

        return Image.fromarray(npimg, mode=mode)

This works but is of course just a quick hack.
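The element count in the RuntimeError can be explained with a small sketch. It assumes the image is held with 32 bits per pixel in little-endian order, which is what the numbers suggest (466 x 394 pixels, but twice as many int16 elements); the thread does not state the PIL mode, and the tiny image and its values below are made up for illustration. It uses only the standard library rather than torch or numpy.

```python
import struct

# The numbers from the error message: two int16 elements per pixel.
assert 466 * 394 * 2 == 367208

# Hypothetical 2 x 3 image of signed 16-bit samples (illustrative values).
height, width = 2, 3
pixels = [-1200, 0, 57, 1024, -3, 32767]

# Assumption: each pixel is stored as a little-endian 32-bit integer.
buf = struct.pack('<%di' % len(pixels), *pixels)

# Reinterpreting the same bytes as int16 yields two values per pixel,
# so a view of shape (height, width, 1) cannot fit.
shorts = struct.unpack('<%dh' % (len(buf) // 2), buf)

# Every second value is the low 16-bit half of a pixel. For samples that
# fit in int16 this recovers the original values, like the [0::2] slice.
low_halves = list(shorts[0::2])
```

Under that assumption the slice only works on little-endian data and only while every sample fits in 16 bits, which is one reason to treat it as a quick hack.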


fmassa commented on May 27, 2024

Ok, cool.
But I was wondering: does PIL natively support image operations on int16 images, such as rotate or crop? If it doesn't, then even if we adapt ToPILImage and ToTensor, we still won't be able to perform these operations. Also, since ToTensor converts the image to float, there would be no way of knowing whether the original image was int16 or uint8, meaning that applying ToTensor() followed by ToPILImage() would not return the identity.


bodokaiser commented on May 27, 2024

According to this issue and this PR, it does, but only for grayscale images.

Regarding the behavior of ToTensor(), one way to solve this would be for ToTensor() to keep the data type of the PIL.Image by default while accepting the target data type as an argument.
Alternatively, we could ignore the fact that ToPILImage(ToTensor()) does not return the identity. Then we would have no API breaks, and I don't think we lose anything through this?


soumith commented on May 27, 2024

Bodo, it looks like you've been making a lot of progress already.

If you want to fire a few PRs to make torchvision work with int16 out of the box, I would love to have them. If not, I will eventually get to this for sure.


bodokaiser commented on May 27, 2024


soumith commented on May 27, 2024

0 to 65 sounds fine for int16/uint16. You can remove image scaling if you want to. I don't have experience with this domain, so I'll let you make the call.

For identity preservation, ToPILImage needs to take a kwarg such as Int16=True for the identity loop to work. I don't see a better way. The same goes for ToTensor: taking the target data type as a kwarg seems good.
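Why the round trip needs a data-type argument can be sketched as follows. This is not torchvision code: `to_float`, `from_float`, and the `FULL_SCALE` table are hypothetical stand-ins built on the standard library's `array` module, and the scale factors are an assumption, since the thread leaves open how int16 data should be scaled. The samples are non-negative to keep the floats in [0, 1].

```python
from array import array

# Hypothetical full-scale value per array typecode
# ('B' = uint8, 'h' = int16). The scaling choice is an assumption.
FULL_SCALE = {'B': 255, 'h': 32767}

def to_float(samples):
    """Scale integer samples to floats; the source type is not recorded."""
    scale = FULL_SCALE[samples.typecode]
    return [s / scale for s in samples]

def from_float(values, typecode='B'):
    """Convert floats back to integers of the requested type (default uint8)."""
    scale = FULL_SCALE[typecode]
    return array(typecode, [int(round(v * scale)) for v in values])

original = array('h', [0, 1000, 32767])
floats = to_float(original)

default_trip = from_float(floats)        # falls back to uint8
explicit_trip = from_float(floats, 'h')  # caller states int16
```

Once the samples are floats, nothing says they came from int16 data, so the default conversion quantizes them to uint8 and the original values are lost. Only the call that names the target type restores them.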


bodokaiser commented on May 27, 2024

@soumith @fmassa PR #122 is up for discussion!


alykhantejani commented on May 27, 2024

@fmassa I think this can now be closed as #122 was merged.


fmassa commented on May 27, 2024

Thanks @alykhantejani !

