Comments (5)

cfzd commented on May 29, 2024

@Durobert
The values 1640./25 and 32./534*590 are the offsets used in test-time augmentation (TTA).

During TTA, we first shift the image and get the prediction for the shifted image. Then we inverse-shift that prediction to recover the correct prediction. That completes one TTA pass.

For example, if we shift the image to the left by x pixels, then x should be added back to the predicted coordinates. The difference here is that we apply the shift on the strided feature map rather than on the image itself. If the feature map's width is 25, then shifting the feature map by 1 pixel corresponds to image width * 1 / 25 pixels in the original image space. This is where 1640./25 comes from (1640 is the image width on CULane).
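The conversion described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from the repository; the function name `feature_shift_to_image_offset` and its parameters are hypothetical.

```python
def feature_shift_to_image_offset(shift_cells: float, feature_size: int, image_size: float) -> float:
    """Convert a shift measured in feature-map cells into original-image pixels.

    Hypothetical helper: one feature-map cell spans image_size / feature_size pixels.
    """
    return shift_cells * image_size / feature_size


CULANE_WIDTH = 1640  # image width on CULane, as stated in the thread

# A 1-cell shift on a feature map of width 25
print(feature_shift_to_image_offset(1, 25, CULANE_WIDTH))  # 65.6

# The same 1-cell shift on a feature map of width 50
print(feature_shift_to_image_offset(1, 50, CULANE_WIDTH))  # 32.8
```

The offset depends on the feature map's width, so a different backbone stride or input resolution changes the constant.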

The value 32./534*590 is similar, but this part also involves a crop operation.

from ultra-fast-lane-detection-v2.

Durobert commented on May 29, 2024

@cfzd
For CULane, you resize the image to 1600*320. I use a ResNet-18 backbone, whose downsampling factor is 32, so the feature map's width is 1600/32 = 50. Should the value then be 1640./50?

cfzd commented on May 29, 2024

@Durobert
It should be correct. Another interesting point: if you always do TTA in both opposite directions with the same shift, you can average the shifted predictions directly, without applying the offset, and still get the correct result, since (pred - offset) + (pred + offset) = 2*pred.
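A small numeric check of this cancellation, as a sketch only. It assumes an idealized model whose prediction on a shifted input is displaced by exactly the offset; the coordinates in `true_x` and the helper name `average_symmetric_tta` are made up for illustration.

```python
import math


def average_symmetric_tta(pred_one_way, pred_other_way):
    """Average two predictions obtained with equal and opposite shifts (hypothetical helper)."""
    return [(a + b) / 2 for a, b in zip(pred_one_way, pred_other_way)]


offset = 1640 / 25                # 1-cell shift on a width-25 feature map
true_x = [410.0, 820.0, 1230.0]   # made-up lane x-coordinates in image pixels

# Idealized assumption: each shifted prediction is off by exactly +/- offset
pred_minus = [x - offset for x in true_x]
pred_plus = [x + offset for x in true_x]

averaged = average_symmetric_tta(pred_minus, pred_plus)

# The offsets cancel, so no explicit correction is applied
print(all(math.isclose(a, t) for a, t in zip(averaged, true_x)))
```

The cancellation only holds when both directions use the same shift magnitude; with a single shifted pass, the offset still has to be applied explicitly.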

Durobert commented on May 29, 2024

@cfzd
Another question, about the value 32./534 * 590. For CULane the crop_ratio is 0.6, so the resized image height is 320/0.6 ≈ 534. Does the value 32./534 * 590 mean that the cropped image height is 32? And if I don't crop, is the value 0?

cfzd commented on May 29, 2024

@Durobert
The offset with the crop operation is a little tricky, and sorry, I have forgotten the details of the derivation. However, the core idea is the same: it only ensures that the prediction from the shifted input is mapped back correctly.

If you don't crop, the logic is the same as for 1640./25. For example, suppose the height of the feature map is hf, the height of the original image is hi, and the number of shifted pixels is x; then the offset is x/hf * hi.
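Written out with the symbols above, the no-crop case looks as follows. The feature-map height of 10 is a hypothetical value chosen only to make the arithmetic concrete; 590 is the CULane image height that appears in 32./534*590. This does not cover the crop case, whose derivation is not given in this thread.

$$
\text{offset} = \frac{x}{h_f}\, h_i, \qquad \text{e.g. } x = 1,\ h_f = 10,\ h_i = 590 \;\Rightarrow\; \text{offset} = \frac{1}{10} \cdot 590 = 59 \text{ pixels}.
$$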
