
Comments (12)

matteorr commented on June 26, 2024

As I suspected in the previous post, the culprit is the different definition of area ranges during evaluation.

When wrapping the COCOeval.evaluate() function I pass the parameters from the COCOanalyze class:

self.cocoEval.params.areaRng    = self.params.areaRng
self.cocoEval.params.areaRngLbl = self.params.areaRngLbl
self.cocoEval.params.maxDets    = self.params.maxDets
self.cocoEval.params.iouThrs    = sorted(self.params.oksThrs)

These values are initialized in the Params class, and my default area ranges differ from the values defined in the original cocoeval repo.

Specifically, since COCO keypoints don't have small instances, I believe the all area range should not include annotations with fewer than 32**2 pixels. That's why I defined the all area range as [32 ** 2, 1e5 ** 2]. Conversely, the coco repo defines the all area range for keypoints as exactly the one used for bbox and segm, i.e. [0 ** 2, 1e5 ** 2].
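For reference, the two competing defaults side by side (an illustrative sketch; the variable names are mine, not from either codebase):

areaRng_all_cocoanalyze = [32 ** 2, 1e5 ** 2]  # COCOanalyze: excludes instances below 32**2 pixels
areaRng_all_cocoeval    = [0 ** 2, 1e5 ** 2]   # original cocoeval: same range as bbox/segm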

This discrepancy means that the number of ground truth instances counted differs between the two evaluations, resulting in different AP (higher for COCOanalyze, since it considers fewer instances), while recall is obviously not affected.

I think my definition makes more sense, and I reached out about it in the past, but they didn't change their code. You can easily choose whichever you prefer, either by changing the default values in the Params class or by overwriting them after instantiating a COCOanalyze object and accessing its params, i.e.:

coco_analyze = COCOanalyze(coco_gt, coco_dt, 'keypoints')
coco_analyze.params.areaRng = [[0 ** 2, 1e5 ** 2], [96 ** 2, 1e5 ** 2], [32 ** 2, 96 ** 2]]

After this change the results will match. I'm closing this issue; please put a thumbs up if you think it's solved, or feel free to reopen if you have further comments.


baolinhu commented on June 26, 2024

Yeah, I agree with your conclusion. Thanks for your patience.


matteorr commented on June 26, 2024

Hi @baolinhu, could you please be more specific?
It would be useful if you could post the values that you're referring to so I can double check. Also, the datasets are slightly different so I wouldn't be too surprised if that were the case.


baolinhu commented on June 26, 2024

@matteorr Thanks for the reply. I used this code to analyze coco2017val, with the same prediction results JSON file in both cases.

  • Output of this code:
    [screenshot of results]
  • Output of other local evaluation code / the online server:
    [screenshot of results]


Yishun99 commented on June 26, 2024

Same question here: a 0.7-point difference in mAP.


matteorr commented on June 26, 2024

@DouYishun, thanks for adding info. I assume you mean 0.007?

Right now, the only results that seem affected are AP@IoU=0.5:0.95 | area=all (.718 instead of .711), and AP@IoU=0.75 | area=all (.796 instead of .789). Recall is not affected.

What is surprising is that the overall AP on medium and large instances is the same, but when using area=all there is a difference. This makes me think the problem might have to do with how small objects are handled (they are not considered in the COCO keypoints evaluation, but my eval code might still be looking at how many small objects there are when computing precision).
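A quick way to probe this hypothesis is to count how many ground truths and detections fall below the 32**2 area threshold; a minimal diagnostic sketch (file paths are placeholders):

from pycocotools.coco import COCO

coco_gt = COCO('person_keypoints_val2017.json')        # placeholder path
coco_dt = coco_gt.loadRes('keypoint_detections.json')  # placeholder path

# Count annotations below the 32**2 area threshold on both sides.
small_gt = [a for a in coco_gt.loadAnns(coco_gt.getAnnIds()) if a['area'] < 32 ** 2]
small_dt = [a for a in coco_dt.loadAnns(coco_dt.getAnnIds()) if a['area'] < 32 ** 2]
print(len(small_gt), 'small ground truths,', len(small_dt), 'small detections')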

I'm currently looking into it and will post updates as soon as possible.


Yishun99 commented on June 26, 2024

@matteorr Yes, I mean 0.007.
Here are my results:
[screenshot of results]

They should be:
[screenshot of expected results]

AP@OKS=0.5:0.95, AP@OKS=0.5 and AP@OKS=0.75 are affected.


baolinhu commented on June 26, 2024

@matteorr Thanks, I see what you mean, but I still don't fully understand. Since COCO keypoints don't have small instances, the number of ground truth instances should be the same: the number of ground truth instances with area less than 32**2 pixels should be 0, so [0 ** 2, 1e5 ** 2] is equivalent to [32 ** 2, 1e5 ** 2] on the ground truth side. I think the range instead affects the number of FP (false positive) samples.


matteorr commented on June 26, 2024

@baolinhu - By "the number of ground truth instances counted is different for the two evaluations" I really meant the number of ground truth instances that are matched to a detection. Setting the area range to a different value also determines which detections are ignored.

So in this particular case, detections smaller than 32**2 will be ignored by my evaluation code. To convince yourself, try removing all the small detections before loading them into the COCOanalyze class, i.e.:

# Drop all detections below the 32**2 area threshold before evaluating.
new_team_split_dts = [d for d in team_split_dts if d['area'] > 32 ** 2]
coco_gt = COCO(annFile)
coco_dt = coco_gt.loadRes(new_team_split_dts)
coco_analyze = COCOanalyze(coco_gt, coco_dt, 'keypoints')
coco_analyze.evaluate(verbose=True, makeplots=False, savedir=saveDir, team_name=teamName)

You'll see that the results in this case are exactly the same whether coco_analyze.params.areaRng[0] is [0 ** 2, 1e5 ** 2] or [32 ** 2, 1e5 ** 2].

This makes sense to me, but if you still don't agree please post back; maybe I am missing your point.


baolinhu commented on June 26, 2024

  • Firstly, you solved my problem. Thanks.
  • Then I will state my point of view. new_team_split_dts = [d for d in team_split_dts if d['area'] > 32 ** 2] decreases the number of false positive samples: a detection that counts as positive but has area less than 32**2 is now ignored. That may be a little unreasonable (when evaluating the all range you should take the perspective of not knowing the dataset, and should not inject the prior information that instances have area larger than 32**2).
    Precision = TP / (TP + FP) gets higher, while recall = TP / (TP + FN) is not affected, because TP + FN is the total number of ground truth instances, which is unchanged; see the numeric sketch below.
  • So I think the real difference is that the number of positive detections counted differs between the two evaluations. Should they be ignored when evaluating the all range?
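To illustrate the effect with made-up counts (purely hypothetical numbers, not taken from this thread):

# Hypothetical counts: 80 matched detections, 20 false positives (10 of
# which are smaller than 32**2), and 20 missed ground truths.
tp, fp, fn = 80, 20, 20
small_fp = 10

precision_all  = tp / (tp + fp)             # 0.80: small FPs counted against you
precision_trim = tp / (tp + fp - small_fp)  # ~0.89: small FPs ignored
recall         = tp / (tp + fn)             # 0.80: unchanged in both cases
print(precision_all, precision_trim, recall)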


matteorr commented on June 26, 2024

@baolinhu - Glad the issue is resolved.

To follow up one last time on what might be the "best" evaluation strategy: my interpretation is that since we know COCO has no ground truth keypoints for instances with area smaller than 32**2, it is better to ignore detections whose area is too small, as they will most likely not be good because of the lack of training data. I agree this strategy might penalize algorithms that make good keypoint predictions for small instances.

An interesting approach could be to ignore all detections with an area smaller than the minimum area at which an IoU of 0.5 is still possible when the detection perfectly overlaps a ground truth of size 32**2.
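To make that threshold concrete: for a detection of area a lying entirely inside a ground truth of area 32**2, the intersection is a and the union is 32**2, so requiring IoU >= 0.5 gives a >= 0.5 * 32**2. A minimal sketch of that bound:

# Smallest detection area that can still reach IoU = 0.5 against a ground
# truth of area 32**2, assuming the detection lies inside the ground truth:
#   IoU = a / 32**2 >= 0.5  =>  a >= 0.5 * 32**2
gt_area = 32 ** 2
min_det_area = 0.5 * gt_area
print(min_det_area)  # 512.0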

In conclusion, I think there is no definitive "right" or "wrong" way of doing it. As long as you are aware of the consequences of either approach and compare all algorithms using the same technique, it shouldn't matter too much.


DanBmh commented on June 26, 2024

Quoting the fix above:

coco_analyze = COCOanalyze(coco_gt, coco_dt, 'keypoints')
coco_analyze.params.areaRng = [[0 ** 2, 1e5 ** 2], [96 ** 2, 1e5 ** 2], [32 ** 2, 96 ** 2]]

Worked for me, but the medium and large results are switched. To be consistent with the linked code above, it should be:

coco_analyze.params.areaRng = [[0 ** 2, 1e5 ** 2], [32 ** 2, 96 ** 2], [96 ** 2, 1e5 ** 2]]
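For clarity, pairing each range with its label (assuming areaRngLbl is ordered ['all', 'medium', 'large'], as the discussion above suggests):

coco_analyze.params.areaRngLbl = ['all', 'medium', 'large']
coco_analyze.params.areaRng = [[0 ** 2, 1e5 ** 2],   # all
                               [32 ** 2, 96 ** 2],   # medium
                               [96 ** 2, 1e5 ** 2]]  # large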

