Comments (12)
I'd say as long as the returned tensor is properly converted to float and scaled to [0,1], things should be fine.
But we need to check if standard image transforms (like rotating, cropping, etc.) work OK in PIL for the int16 type.
Also, LongTensor is actually int64; you might be looking for a ShortTensor instead (which is signed).
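To make the dtype and scaling point concrete, here is a minimal sketch using NumPy dtypes as stand-ins for the tensor types (ShortTensor corresponds to int16, LongTensor to int64). The helper `to_unit_float` is hypothetical, and dividing by the dtype's maximum is an assumption; it is not how torchvision necessarily scales.

```python
import numpy as np

# NumPy dtypes standing in for the tensor types named above:
# ShortTensor <-> int16 (signed, 2 bytes), LongTensor <-> int64 (8 bytes).
for dtype in (np.uint8, np.int16, np.int64):
    info = np.iinfo(dtype)
    print(np.dtype(dtype).name, np.dtype(dtype).itemsize, info.min, info.max)


def to_unit_float(arr):
    """Hypothetical helper: map non-negative integer pixels to float32 in [0, 1]."""
    scale = np.iinfo(arr.dtype).max  # assumption: scale by the dtype's maximum
    return arr.astype(np.float32) / scale


# A tiny synthetic int16 "image" with non-negative intensities.
pixels = np.array([[0, 1024], [16384, 32767]], dtype=np.int16)
scaled = to_unit_float(pixels)
print(scaled.dtype, scaled.min(), scaled.max())
```

Negative int16 values would fall outside [0, 1] with this scale, so a real implementation would have to decide how to treat them.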
from vision.
That's the problem. I put together a minimal example that lets you examine it.
from torchvision.transforms import Compose, ToPILImage, ToTensor
from matplotlib import pyplot as plt
import skimage.io
import numpy as np
img = skimage.io.imread('mr.tif')
print('img', img.shape, img.dtype)
plt.imshow(img)
plt.show()
transform = Compose([
    ToPILImage(),
    ToTensor(),
])
timg = transform(np.expand_dims(img, 2))
plt.imshow(timg[0].numpy())
plt.show()
Here you can find the corresponding TIFF file:
mr.tif.zip
from vision.
The example that you mentioned shows that the current code is not adapted to int16 images, or did you try adding the modifications you mentioned?
from vision.
updated:
# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
    nchannel = 3
else:
    nchannel = len(pic.mode)
# handle PIL Image
buf = pic.tobytes()
if len(buf) > pic.width * pic.height * nchannel:
    img = torch.ShortTensor(torch.ShortStorage.from_buffer(buf, 'native'))
else:
    img = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))
img = img.view(pic.size[1], pic.size[0], nchannel)
fails with RuntimeError: size '[466 x 394 x 1]' is invalid for input of with 367208 elements at /private/var/folders/y0/d4npmpd50971gpgqxtsvc25m0000gn/T/pip-_fraocf5-build/torch/lib/TH/THStorage.c:59
however changing to:
if len(buf) > pic.width * pic.height * nchannel:
    img = torch.ShortTensor(np.fromstring(buf, dtype=np.int16)[0::2])
else:
    img = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))
img = img.contiguous().view(pic.size[1], pic.size[0], nchannel)
does the job. Furthermore we need to change:
class ToPILImage(object):
    """Converts a torch.*Tensor of range [0, 1] and shape C x H x W
    or numpy ndarray of dtype=uint8, range[0, 255] and shape H x W x C
    to a PIL.Image of range [0, 255]
    """
    def __call__(self, pic):
        npimg = pic
        mode = None
        if not isinstance(npimg, np.ndarray):
            npimg = pic.mul(255).byte().numpy()
            npimg = np.transpose(npimg, (1, 2, 0))
        if npimg.shape[2] == 1:
            npimg = npimg[:, :, 0]
            if npimg.dtype != np.int16:
                mode = "L"
        return Image.fromarray(npimg, mode=mode)
This works but is of course just a quick hack.
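A note on why the first version fails and the [0::2] slice works: the error reports 367208 elements, which is exactly 2 x 466 x 394. That suggests the buffer holds 4 bytes per pixel (consistent with PIL's 32-bit mode I) rather than 2. The sketch below reproduces the arithmetic with NumPy only; the synthetic `pixels` array is a stand-in for the image, and little-endian byte order is an explicit assumption.

```python
import numpy as np

height, width = 466, 394  # image size taken from the error message

# Synthetic stand-in for the image: small non-negative values stored as
# 32-bit little-endian integers (4 bytes per pixel).
pixels = (np.arange(height * width) % 4096).astype('<i4').reshape(height, width)
buf = pixels.tobytes()

# Reading the same bytes as int16 yields two elements per pixel, which is
# why a view of size height x width x 1 is rejected.
as_int16 = np.frombuffer(buf, dtype='<i2')
print(as_int16.size)  # twice the pixel count

# On a little-endian layout the even indices hold the low 16 bits of each
# pixel, so taking every second element recovers the values.
low_halves = as_int16[0::2].reshape(height, width)

# Alternative: decode as 32-bit integers first, then cast down.
decoded = np.frombuffer(buf, dtype='<i4').astype(np.int16).reshape(height, width)
```

With big-endian data the low halves would sit at the odd indices instead, so decoding with the buffer's real element width and then casting is probably the less fragile route.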
from vision.
Ok, cool.
But I was wondering, does PIL natively support image operations on int16 images, such as rotate or crop? If it doesn't, then even if we adapt ToPILImage and ToTensor, we still won't be able to perform these operations. Also, as ToTensor converts the image to float, there would be no way of knowing if the original image was int16 or uint8, meaning that applying ToTensor() followed by ToPILImage would not return the identity.
from vision.
According to this issue and this PR it does only for grayscale images.
Regarding the behavior of ToTensor(), one way to solve this would be for ToTensor() to keep the data type from PIL.Image but optionally take the target data type as an argument.
Alternatively, we could simply ignore the fact that ToPILImage(ToTensor()) does not return the identity. Then we would have no API breaks, and I also don't think we lose anything through this?
from vision.
Bodo, it looks like you've been making a lot of progress already.
If you want to fire a few PRs to make torchvision work with int16 out of the box, I would love to have them. If not, I will eventually get to this for sure.
from vision.
0 to 65 sounds fine for int16/uint16. You can remove image scaling if you want to. I don't have experience with this domain, so I'll let you make the call.
In the case of identity preservation, ToPILImage needs to take a kwarg of Int16=True or something for the identity loop to happen. I don't see a better way. Same for ToTensor: taking the target data type as a kwarg seems good.
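The kwarg idea can be sketched roughly as follows. This is a NumPy stand-in, not the torchvision API: `to_float` and `from_float` are hypothetical names, and scaling by the dtype's maximum is an assumption. The point is only that the float array carries no record of the source dtype, so the inverse needs to be told which one to restore.

```python
import numpy as np


def to_float(arr):
    """Hypothetical stand-in for ToTensor: integer pixels -> float32 in [0, 1]."""
    return arr.astype(np.float32) / np.iinfo(arr.dtype).max


def from_float(arr, dtype=np.uint8):
    """Hypothetical stand-in for ToPILImage: the caller names the target dtype."""
    scale = np.iinfo(dtype).max
    return np.rint(arr * scale).astype(dtype)


original = np.array([[0, 300], [12345, 32767]], dtype=np.int16)
as_float = to_float(original)

# Without the kwarg the default (uint8) is used and the values change.
default_trip = from_float(as_float)

# Passing the original dtype restores the input.
restored = from_float(as_float, dtype=np.int16)
print(default_trip.dtype, restored.dtype)
```

Defaulting to uint8 keeps existing callers unchanged, which matches the concern about avoiding API breaks.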
from vision.
@soumith @fmassa PR #122 is up for discussion!
from vision.
@fmassa I think this can now be closed as #122 was merged.
from vision.
Thanks @alykhantejani !
from vision.