Comments (13)

soumith avatar soumith commented on May 27, 2024

do you mean, like 5-crop and 10-crop?

edouardoyallon avatar edouardoyallon commented on May 27, 2024

Hi, yes. It can be implemented with a custom transform for the input and the class. I'm trying to write one; hopefully it will be nice enough to submit as a PR.

catalystfrank avatar catalystfrank commented on May 27, 2024

@edouardoyallon have you done an implementation of multi-crop? If so, where can I find it? Many thanks.

penguinshin avatar penguinshin commented on May 27, 2024

Has anyone done the multi-crop implementation yet?

alykhantejani avatar alykhantejani commented on May 27, 2024

I can implement this but I have a few comments/questions:

  1. I think we should only implement 5-crop, since the 10-crop version is just the 5 crops plus their horizontal flips; once #240 is merged, building 10-crop from the 5-crop version will be trivial (see the sketch after this list).
  2. Should we call it Oversample, following the Caffe convention? Or do you have a better name?
  3. This transform won't be (easily) useful for training: as far as I can tell, if a transform emits multiple images/tensors you need a custom collate function to put them all into a batch. I think the same problem occurs if you want to batch these 5/10 crops for fast evaluation with the PyTorch DataLoader.
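
A minimal sketch of that relationship, assuming the functional-style helpers five_crop and hflip and an illustrative crop size of 224 (these specifics are assumptions, not part of the proposal above):

import torchvision.transforms.functional as F

def ten_crop_from_five(img, size=224):
    # the 10 crops are the 5 crops of the image (4 corners + center)
    # plus the same 5 crops of its horizontally flipped copy
    crops = F.five_crop(img, size)
    flipped_crops = F.five_crop(F.hflip(img), size)
    return crops + flipped_crops  # tuple of 10 PIL Images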

fmassa avatar fmassa commented on May 27, 2024

About your points:

  1. While I agree that 10-crop would be a simple extension of 5-crop, I think 10-crop is common enough that it would be the most used case, so I'd rather keep it.
  2. Oversample is not a representative name to me. In fb-resnet-torch they called it TenCrop, what do you think? Or maybe MultiCrop?
  3. I believe this transform is only used at test time, to improve results during evaluation, so I don't think training should be a concern. We could just concatenate the crops over the first dimension, for example, but that might indeed need a different collate function (a rough sketch follows this list).
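
A rough sketch of such a collate function, assuming each sample is a (crops, target) pair where crops is already a (ncrops, C, H, W) tensor (the names and shapes here are illustrative assumptions):

import torch

def crops_collate(batch):
    # concatenate every sample's crop stack over the first dimension
    crops = torch.cat([sample[0] for sample in batch], 0)    # (bs * ncrops, C, H, W)
    targets = torch.tensor([sample[1] for sample in batch])  # (bs,)
    return crops, targets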

alykhantejani avatar alykhantejani commented on May 27, 2024

1 - makes sense
2 - I think FiveCrop and TenCrop are pretty good; MultiCrop suggests it could take an arbitrary number of crops, which it wouldn't, so it might be misleading.
3 - yeah, I think even for testing you will need a special collate function (unless you want to test one 10-cropped image at a time).

Ok great, I think that's enough to go on. Will send a PR soon.

alykhantejani avatar alykhantejani commented on May 27, 2024

This has now been merged into master

wlnirvana avatar wlnirvana commented on May 27, 2024

Could anyone elaborate on how FiveCrop and TenCrop are supposed to be used? It seems they return a plain Python list, which breaks ToTensor since the output is neither an ndarray nor a PIL.Image.

fmassa avatar fmassa commented on May 27, 2024

@wlnirvana for the moment, chaining directly with ToTensor doesn't work, and you need to manually call ToTensor on each of the outputs, following an approach similar to the one described in #230. We will discuss a better way of doing this, because it doesn't fit nicely with the dataloader either.
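
For illustration, a minimal sketch of calling ToTensor on each crop by wrapping the list in a Lambda (the Lambda approach and the crop size of 224 are assumptions, not necessarily the approach from #230):

import torch
from torchvision import transforms

transform = transforms.Compose([
    transforms.TenCrop(224),  # produces a list/tuple of 10 PIL Images
    # run ToTensor on every crop and stack the results into one (10, C, H, W) tensor
    transforms.Lambda(lambda crops: torch.stack([transforms.ToTensor()(c) for c in crops])),
])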

wlnirvana avatar wlnirvana commented on May 27, 2024

@fmassa I see. Thanks for the quick response.

fmassa avatar fmassa commented on May 27, 2024

cc @alykhantejani so that we can discuss this this week

fmassa avatar fmassa commented on May 27, 2024

@wlnirvana so, we discussed this with @alykhantejani and we think the best option in this case is to use the functional interface. We reached the conclusion that you need to do some post-processing of the results anyway, because you will want to average/max the results over the crops.
Here is a snippet that illustrates how you can do it:

import torch
from torchvision.transforms import functional as F

def my_transform(img):
    # do any transforms you want here, then take the ten crops (size 224 is an example)
    imgs = F.ten_crop(img, 224)  # this is a tuple of 10 PIL Images
    mean, std = [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]  # e.g. ImageNet statistics
    return torch.stack([F.normalize(F.to_tensor(x), mean, std) for x in imgs], 0)  # 4D tensor (ncrops, C, H, W)
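
For context, a hypothetical way to plug this transform into a dataset and loader (ImageFolder, the path, and the batch size are placeholders, not part of the snippet above):

from torch.utils.data import DataLoader
from torchvision import datasets

# the default collate stacks each sample's (ncrops, C, H, W) tensor
# into a 5D batch of shape (bs, ncrops, C, H, W)
val_dataset = datasets.ImageFolder('/path/to/val', transform=my_transform)
val_loader = DataLoader(val_dataset, batch_size=32)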

Now in your test loop, you obtain a batch from your data loader and do something like:

input, target = batch
# input is a 5D tensor of shape (bs, ncrops, c, h, w)
bs, ncrops, c, h, w = input.size()
result = model(input.view(-1, c, h, w))  # fuse batch size and ncrops
# say result is a 2D tensor of shape (bs * ncrops, num_classes)
result_averaged = result.view(bs, ncrops, -1).mean(1)  # average over the crops
