Code Monkey home page Code Monkey logo

Comments (8)

NicolasHug avatar NicolasHug commented on September 27, 2024 1

Hi @sidijju ,
We'll have to make sure the v2 version is still compatible with v1. However, there should be an easy workaround for the issue you're seeing - e.g. perhaps setting check_v1_compatibility=(size is not None)? Since this is a new feature that v1 doesn't have, we shouldn't need to check for v1 compat for these parameter values anyway.

I'm not 100% sure what the best solution is right now though, do you mind opening a PR so I can take a better look? It's totally OK if the PR isn't completely ready just yet (you can e.g. set it as "draft")

from vision.

sidijju avatar sidijju commented on September 27, 2024 1

@NicolasHug Just made a PR!

from vision.

NicolasHug avatar NicolasHug commented on September 27, 2024 1

Closed by #8459, thank you @sidijju !

Thanks for the feedback @MyNameIsFu , this will be available soon in the next version (0.19)

from vision.

NicolasHug avatar NicolasHug commented on September 27, 2024

Thanks for the feature request @sidijju .

In order to get [500, 250] and [250, 500] from these specific input images, you could set size=499, max_size=500. But of course, this isn't a great UX, and it might not be possible to find a size value that would satisfy all input images.

There's been discussion of adding an edge parameter in the past but the parameters of resize are already fairly confusing. It seems that if we were to allow size=None, max_size=500 we could implement the behaviour you are looking for and this should cover all of the potential use cases:

  • size=tuple -> resize to fixed size
  • size=int -> resize shorter edge to size while preserving aspect ratio
  • size=int, max_size=int -> try to resize shorter edge to size while preserving aspect ratio but if resulting longer edge exceeds max_size, then scale down. This corresponds to the resizing strategy of some detection models.
  • size=None, max_size -> resize longer edge to max_size while preserving aspect ratio.

The first 3 are already implemented, the last one isn't. Any thoughts @pmeier @vfdev-5 ?

from vision.

sidijju avatar sidijju commented on September 27, 2024

Thanks for the reply @NicolasHug! I agree that setting size=499, max_size=500 would work for this set of input images, but I'm not sure about the effect this would have on a more varied dataset. I also agree that it isn't the best UX since it's not very intuitive.

I think the proposal for a size=None option is a good stopgap for now. If others agree and I have some guidance from more experienced contributors, I can attempt to implement this feature.

from vision.

NicolasHug avatar NicolasHug commented on September 27, 2024

Thanks for your feedback @sidijju . I'm happy to review a PR from you if you'd like to try to submit one. Our contributing guide is here: https://github.com/pytorch/vision/blob/main/CONTRIBUTING.md

For this specific change you'd only need to update torchvision.transforms.v2 (transform class, PIL functional and tensor functional). No need to change the "v1" transforms, i.e. the stuff in torchvision.transforms

from vision.

sidijju avatar sidijju commented on September 27, 2024

Hey @NicolasHug! Quick question for you regarding the development of this feature - would it be acceptable to make the new v2 Resize no longer compatible with the v1 Resize (by removing the _v1_transform_cls attribute from v2.Resize)?

Some of the current unit tests check for v1 compatibility, and this is running into issues since the v1 Resize doesn't accept size=None as an argument. In this scenario, we could implement a way to extract parameters for the v1 version of the transform, as other v2 transforms have done, but the problem with this is that there isn't a clean way to provide a value for the v1 size that will closely approximate the result of the v2 size = None, max_size = x.

Right now, I've implemented a workaround that sets size to max_size - 1 in these scenarios (which passes all the unit tests) but I'd appreciate your opinion on this before I make a PR.

from vision.

MyNameIsFu avatar MyNameIsFu commented on September 27, 2024

Using the current implementation (if using the workaround with size = max_size-1 ) on different aspect ratios results in different behaviours for shapes (n, m)
n < m --> (new_n, max_size)
n == m --> (max_size - 1, max_size -1)
which crashes my pipeline for mixed datasets.
Your PR would fix this issue so thanks and i'm looking forward for the merge!

from vision.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.