Comments (13)
Do you mean something like 5-crop and 10-crop?
from vision.
Hi, yes. It can be implemented by using a custom transform for the input and the class. I'm trying to write one. Hopefully it will be nice enough to be a PR.
@edouardoyallon have you done an implementation of multi-crop? If so, where can I find it? Many thanks.
Has anyone done the multi-crop implementation yet?
I can implement this, but I have a few comments/questions:
- I think we should only implement 5-crop, as the 10-crop version is the 5 crops plus their flipped versions, and after #240 is merged this will be trivial to do with just the 5-crop version.
- Should we call it `Oversample`, following the Caffe standard? Or do you have a better name?
- This transform won't be useful for training (at least not easily): if your transform emits multiple images/tensors, you would need a custom collate function to put them all into a batch, as far as I can tell. I think the same problem occurs if you want to batch these 5/10 crops for fast evaluation using the PyTorch DataLoader?
About your points:
- While I agree that it would be a simple extension to get the 10-crop from the 5-crop, I think the 10-crop is common enough that it would be the most used case, so I'd rather keep it.
- `oversample` is not a representative name to me. In `fb-resnet-torch`, they called it `TenCrop`; what do you think? Or maybe `MultiCrop`?
- I believe this transform is only used for testing, in order to improve results during evaluation, so I think this shouldn't be a concern. We could eventually just concatenate over the 1st dimension, for example, but that might indeed need a different `collate` function.
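As a rough illustration of the "concatenate over the 1st dimension" idea, a custom collate function might look like the sketch below. The name `multicrop_collate` and the assumption that each sample is a `(crops, target)` pair with `crops` of shape `(ncrops, C, H, W)` are hypothetical and not part of torchvision.

```python
import torch

def multicrop_collate(samples):
    """Hypothetical collate_fn: merge (ncrops, C, H, W) samples into one 4D batch."""
    # Concatenate all crops of all samples along the first dimension.
    crops = torch.cat([c for c, _ in samples], dim=0)  # (bs * ncrops, C, H, W)
    ncrops = samples[0][0].size(0)
    # Repeat each target once per crop so labels stay aligned with the crops.
    targets = torch.tensor([t for _, t in samples]).repeat_interleave(ncrops)
    return crops, targets

# Two fake samples, each with 10 crops of a 3x8x8 image.
samples = [(torch.zeros(10, 3, 8, 8), 0), (torch.ones(10, 3, 8, 8), 1)]
batch_crops, batch_targets = multicrop_collate(samples)
```

Such a function would be passed to the loader as `DataLoader(dataset, collate_fn=multicrop_collate)`.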
1 - Makes sense.
2 - I think `FiveCrop` and `TenCrop` are pretty good. `MultiCrop` wouldn't accept any other number of crops, so it might be misleading?
3 - Yeah, I think even for testing you will need a special collate function (unless you want to test one 10-cropped image at a time).
Ok great, I think that's enough to go on. Will send a PR soon.
This has now been merged into master
Could anyone elaborate on how `FiveCrop` and `TenCrop` are supposed to be used? It seems they return a plain Python `list`, which fails `ToTensor` as it is neither an `ndarray` nor a `PIL.Image`.
@wlnirvana for the moment, chaining them with `ToTensor` doesn't work, and you need to manually call `ToTensor` on each of the outputs, following an approach similar to the one described in #230. We will discuss a better way of doing this, because it doesn't fit nicely with the dataloader either.
@fmassa I see. Thanks for the quick response.
cc @alykhantejani so that we can discuss this this week
@wlnirvana so, we discussed this with @alykhantejani, and we think the best option in this case is to use the functional interface. We reached the conclusion that you need to do some post-processing of the results anyway, because you will want to average/max the results over the crops.
Here is a snippet that illustrates how you can do it:
    def my_transform(img):
        # do any transforms you want here
        imgs = ten_crop(img)  # this is a sequence of PIL Images
        return torch.stack([normalize(to_tensor(x)) for x in imgs], 0)  # returns a 4D tensor
Now in your test loop, you obtain `batch` from your data loader, and you'd do something like
    input, target = batch
    # input is a 5D tensor
    bs, ncrops, c, h, w = input.size()
    result = model(input.view(-1, c, h, w))  # fuse batch size and ncrops
    # say result is a 2D tensor
    result_averaged = result.view(bs, ncrops, -1).mean(1)  # avg over crops
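For a self-contained version of the test-loop step, here is a sketch that uses random tensors in place of a real data loader and a single linear layer in place of a real network. The shapes, the `model` definition, and the class count are made up for illustration.

```python
import torch
from torch import nn

bs, ncrops, c, h, w = 4, 10, 3, 8, 8
num_classes = 5  # made-up value

# Stand-in for a real network: flatten each crop and apply one linear layer.
model = nn.Sequential(nn.Flatten(), nn.Linear(c * h * w, num_classes))
model.eval()

inputs = torch.randn(bs, ncrops, c, h, w)  # stand-in for one 5D batch

with torch.no_grad():
    # Fuse batch and crop dimensions so the model sees ordinary 4D input.
    scores = model(inputs.view(-1, c, h, w))   # (bs * ncrops, num_classes)
    scores = scores.view(bs, ncrops, -1)       # (bs, ncrops, num_classes)
    scores_avg = scores.mean(dim=1)            # average over crops
    scores_max = scores.max(dim=1).values      # or take the max over crops
    preds = scores_avg.argmax(dim=1)           # one prediction per image
```

Whether averaging or taking the max works better depends on the model and on what its outputs represent (logits versus probabilities), so it is worth comparing both on a validation set.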