Comments (10)

BichenWuUCB commented on July 22, 2024

@ByeonghakYim Thanks for your question. I'm also curious to see how the IOU-thresholding based box matching works for you. Could you try to explain a bit more on what you mean by

I have tried your method, but the loss does not converge below 3, whereas the method above gets near 1 by the end of training. It also misses a lot of objects (about 40%).

Thanks.

from squeezedet.

bhyim516 commented on July 22, 2024

I followed the method described in your paper:
"a ground truth bounding box. During training, we compare ground truth bounding boxes with all anchors and assign them to the anchors that have the largest overlap (IOU) with each of them. The reason is that we want to select the “closest” anchor to match the ground truth box such that the transformation needed is reduced to minimum. I_ijk evaluates to 1 if the k-th anchor at position-(i, j) has the largest overlap with a ground truth box, and to 0 if no ground truth is assigned to it. This way, we only include the loss generated by the “responsible” anchors. As there can be multiple objects per image, we normalize the loss by dividing it by the number of objects."
so that each ground truth box gets one anchor. But I found that for some of them the best IOU is under 0.1, and this makes the loss converge at a high value (the minimum loss is near 3, which is higher than with the other box matching method, where the loss is near 1). I also found that your anchors come from the KITTI bbox distribution. I think they are good for KITTI, but not for the general case.
For a more general case, I assigned the anchors as
(30,30), (20,40), (17,50), (40,20), (50,17)
(60,60), (40,80), (35,100), (80,40), (100,35)
(120,120)...
(200,200)...
(300,300)...

Thanks.


andreapiso commented on July 22, 2024


bhyim516 commented on July 22, 2024

@AndreaPisoni Yes, it will perform well on KITTI, but it is not good for the general case. I think anchor boxes should be kept simple when there is a lot of data, and I'm considering a more general case. Thanks for your comment.


BichenWuUCB commented on July 22, 2024

@ByeonghakYim

so that each ground truth box gets one anchor. But I found that for some of them the best IOU is under 0.1

I wonder why the top-matched anchor has an IOU of only 0.1 with the ground truth. I can think of the following reasons:

  • There are not enough anchor shapes to match the ground truth.
  • The spatial density of anchors is not high enough to match small objects.
  • A lot of objects of similar shapes appear at the same location, such that at one position there are not enough anchors to match them.

In your case, does it fall into any of the above situations?
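
To check the first reason, one option is to compare only the shapes of the boxes. The sketch below is not from the squeezeDet code. It assumes boxes are given as (width, height) pairs, and the names `shape_iou`, `best_shape_match`, and `ANCHOR_SHAPES` are made up for illustration. The anchor shapes are a subset of the list posted earlier in the thread, and the ground truth sizes are invented. The IOU of two boxes sharing a center is an upper bound on the IOU at any offset, so a low value here means no anchor position can rescue the match.

```python
def shape_iou(wh_a, wh_b):
    """IOU of two axis-aligned boxes that share the same center.

    Each argument is a (width, height) pair.
    """
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union


def best_shape_match(gt_wh, anchor_shapes):
    """Return (best_iou, best_shape) for one ground truth box size."""
    return max((shape_iou(gt_wh, shape), shape) for shape in anchor_shapes)


# Subset of the anchor shapes listed earlier in the thread.
ANCHOR_SHAPES = [
    (30, 30), (20, 40), (17, 50), (40, 20),
    (60, 60), (40, 80), (35, 100), (80, 40), (100, 35),
]

# Hypothetical ground truth sizes, chosen only to exercise the check.
for gt_wh in [(10, 10), (45, 45), (25, 90)]:
    iou, shape = best_shape_match(gt_wh, ANCHOR_SHAPES)
    print(f"gt {gt_wh}: best shape {shape}, upper-bound IOU {iou:.3f}")
```

If many ground truth sizes in a dataset score low here, the anchor shapes are the likely bottleneck. If they score well but the real top IOU is still low, anchor density or crowding is the more likely cause.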


ByeonghakYim commented on July 22, 2024

@BichenWuUCB
Thanks, there were some mistakes and I solved that problem.
But I've got another question.
One anchor box can end up matching multiple ground truth boxes.
In this case, how do you propagate the loss to that anchor box during backpropagation?


BichenWuUCB commented on July 22, 2024

An anchor is not going to be matched with multiple ground truth boxes. At this line and below, you can see how this is handled.


ByeonghakYim commented on July 22, 2024

@BichenWuUCB
Thanks.
This part seems to be what prevents the issue:

if ov_idx not in aidx_set:
    aidx_set.add(ov_idx)
    aidx = ov_idx
    if mc.DEBUG_MODE:
        max_iou = max(overlaps[ov_idx], max_iou)
        min_iou = min(overlaps[ov_idx], min_iou)
        avg_ious += overlaps[ov_idx]
        num_objects += 1
    break

I have one more question, and I'm sorry for asking so many.
I think that if there are many overlapping ground truth boxes, some of them cannot get their optimal anchor, which will lead to a very large localization offset.
I also found that you use the minimum distance to match boxes when every IOU is 0, and I think this leads to the same problem.
I was just wondering whether this is a problem in practice.
You have much more experience with object detection than I do, and I would like to know your opinion.
Thanks.
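
For readers following along, the concern above can be reproduced with a small stand-alone sketch. It is not the squeezeDet implementation: it only mirrors the logic visible in the quoted snippet (a set of already-used anchor indices) plus the minimum-distance fallback described in this comment. The function name `assign_anchors` and the precomputed `overlaps` and `distances` matrices are assumptions made for illustration.

```python
def assign_anchors(overlaps, distances):
    """Give every ground truth box its own anchor, greedily.

    overlaps[g][a]  -- IOU between ground truth g and anchor a
    distances[g][a] -- any box distance between ground truth g and anchor a
    Returns a list with one anchor index per ground truth box.
    """
    used = set()  # plays the role of aidx_set in the quoted snippet
    assignment = []
    for ov_row, dist_row in zip(overlaps, distances):
        aidx = None
        # Walk the anchors from highest to lowest IOU.
        by_iou = sorted(range(len(ov_row)), key=lambda a: ov_row[a], reverse=True)
        for a in by_iou:
            if ov_row[a] <= 0:
                break  # no overlapping anchor is left
            if a not in used:
                aidx = a
                break
        if aidx is None:
            # Fallback: take the nearest anchor that is still free.
            by_dist = sorted(range(len(dist_row)), key=lambda a: dist_row[a])
            for a in by_dist:
                if a not in used:
                    aidx = a
                    break
        if aidx is None:
            raise ValueError("more ground truth boxes than anchors")
        used.add(aidx)
        assignment.append(aidx)
    return assignment


# Made-up numbers: both of the first two boxes prefer anchor 0.
overlaps = [[0.8, 0.5, 0.0],
            [0.9, 0.1, 0.0],
            [0.0, 0.0, 0.0]]
distances = [[1.0, 2.0, 3.0],
             [1.0, 2.0, 3.0],
             [1.0, 2.0, 3.0]]
print(assign_anchors(overlaps, distances))
```

In this toy input the second box overlaps anchor 0 with IOU 0.9 but is pushed to anchor 1 with IOU 0.1, which is the kind of large localization offset the question is about. The result also depends on the order in which ground truth boxes are processed.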


bayesian-mind commented on July 22, 2024

@BichenWuUCB
Thanks, there were some mistakes and I solved that problem.
But I've got another question.
One anchor box can end up matching multiple ground truth boxes.
In this case, how do you propagate the loss to that anchor box during backpropagation?

@ByeonghakYim How did you end up solving your issue?


peek1999 commented on July 22, 2024

@BichenWuUCB
Thanks, there were some mistakes and I solved that problem.
But I've got another question.
One anchor box can end up matching multiple ground truth boxes.
In this case, how do you propagate the loss to that anchor box during backpropagation?

@ByeonghakYim How did you end up solving your issue?

One anchor box is not supposed to match multiple ground truth boxes in an image. Since one anchor box corresponds to only one model prediction, it should be matched with only one ground truth box. You can either match it to any one of the possible matches or use the ground truth box that has the highest IoU.
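
The second option (keep the ground truth box with the highest IoU) could look roughly like the sketch below. It assumes the candidate matches are already available as `(gt_index, anchor_index, iou)` tuples. Both that format and the name `resolve_conflicts` are hypothetical and not part of squeezeDet.

```python
def resolve_conflicts(candidates):
    """Keep, for every anchor, only the ground truth box with the highest IoU.

    candidates -- iterable of (gt_index, anchor_index, iou) tuples
    Returns a dict mapping anchor_index -> gt_index.
    """
    best = {}  # anchor_index -> (gt_index, iou) of the best match seen so far
    for gt_idx, anchor_idx, iou in candidates:
        if anchor_idx not in best or iou > best[anchor_idx][1]:
            best[anchor_idx] = (gt_idx, iou)
    return {anchor_idx: gt_idx for anchor_idx, (gt_idx, _) in best.items()}


# Made-up example: ground truth boxes 0 and 1 both claim anchor 5.
matches = resolve_conflicts([(0, 5, 0.4), (1, 5, 0.7), (2, 3, 0.2)])
print(matches)
```

Note that a ground truth box that loses its anchor this way ends up with no anchor at all unless it is reassigned, so this rule trades missed objects for cleaner regression targets.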

