re-move's People

Contributors

furkanyesiler

re-move's Issues

Some bug in evaluation metrics with null positives

(Tagging @helenkesete)

We were looking at the evaluation metrics implemented here, and ran into what I think is a pretty subtle bug in the MRR calculation.

re-move/utils/metrics.py

Lines 27 to 32 in 5ddd5df

_, spred = torch.topk(ypred, k, dim=1)
found = torch.gather(ytrue, 1, spred)
temp = torch.arange(k).float() * 1e-6
_, sel = torch.topk(found - temp, 1, dim=1)
mrr = torch.mean(1/(sel+1).float())

The above code identifies the position of the first positive result if one exists. However, if no positive result exists, the topk call simply returns the first position, because found = [0, 0, 0, 0, ...] and therefore found - temp = [0, -1e-6, -2e-6, -3e-6, ...]. Each such query is scored as if its first positive were at rank 1, which inflates the metric for queries with null ytrue sets.
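To make the failure mode concrete, here is a minimal sketch in plain Python, with lists standing in for the tensors so that it runs without torch. The helper name `first_hit_rank` and the toy inputs are hypothetical; the argmax is meant to mirror what `torch.topk(found - temp, 1)` selects for a single row.

```python
def first_hit_rank(found):
    """1-based rank chosen by the tie-break trick (argmax of found - temp)."""
    # Subtract a tiny index-proportional penalty so the earliest hit wins.
    scores = [f - i * 1e-6 for i, f in enumerate(found)]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best + 1

with_hit = [0.0, 0.0, 1.0, 0.0]  # first positive sits at rank 3
no_hits = [0.0, 0.0, 0.0, 0.0]   # null query: nothing relevant at all

ranks = [first_hit_rank(with_hit), first_hit_rank(no_hits)]
mrr = sum(1.0 / r for r in ranks) / len(ranks)

print(ranks)          # [3, 1]
print(round(mrr, 3))  # 0.667
```

The null query is assigned rank 1, so the mean reciprocal rank comes out as 0.667 rather than the 0.333 obtained from the one query that actually has a positive.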

Looking a bit further down in the evaluation, I noticed that this case is handled correctly(*) in meanAP:

re-move/utils/metrics.py

Lines 40 to 41 in 5ddd5df

ap = torch.sum(prec*mask, 1)/(torch.sum(ytrue, 1)+eps)
ap = ap[torch.sum(ytrue, 1) > 0]

where the mean over AP scores is restricted only to those queries where sum(ytrue) > 0.

There are two caveats to this:

  1. meanAP is potentially averaged over a different query set than the other metrics, which seems not ideal.
  2. when reduce_mean=False is passed in, the returned vector of per-query AP scores is the one already conditioned on having positive results, but the evaluator has lost track of the corresponding indices. This will make it difficult to link back to the input data later on.
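The second caveat can be illustrated with a small sketch in plain Python. The variable names and the toy AP values are made up for illustration only. Boolean filtering shortens the vector, whereas NaN placeholders keep position i tied to query i.

```python
import math

ap = [0.50, 0.00, 0.25, 0.00]  # toy per-query AP values (hypothetical)
n_positives = [2, 0, 1, 0]     # queries 1 and 3 have no positives

# Filtering drops entries, so positions no longer map back to queries.
filtered = [a for a, n in zip(ap, n_positives) if n > 0]

# Keeping the full length and marking null queries with NaN preserves alignment.
aligned = [a if n > 0 else math.nan for a, n in zip(ap, n_positives)]

# A NaN-aware mean gives the same summary value as the filtered version.
valid = [a for a in aligned if not math.isnan(a)]
mean_ap = sum(valid) / len(valid)

print(len(filtered), len(aligned))  # 2 4
print(mean_ap)                      # 0.375
```

Both variants yield the same mean, but only the NaN-marked vector can be joined back to the input queries by index.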

So I have a couple of proposed modifications:

    # Identify queries with positive results
    has_positives = torch.sum(ytrue, 1) > 0

    _, spred = torch.topk(ypred, k, dim=1)
    found = torch.gather(ytrue, 1, spred)

    temp = torch.arange(k).float() * 1e-6
    _, sel = torch.topk(found - temp, 1, dim=1)

    # Knock out queries with no positives
    sel = sel.float()
    sel[~has_positives] = torch.nan

    mrr = torch.nanmean(1/(sel+1).float())
    mr = torch.nanmean((sel+1).float())
    top1 = torch.sum(found[:, 0])
    top10 = torch.sum(found[:, :10])

    pos = torch.arange(1, spred.size(1)+1).unsqueeze(0).to(ypred.device)
    prec = torch.cumsum(found, 1)/pos.float()
    mask = (found > 0).float()
    ap = torch.sum(prec*mask, 1)/(torch.sum(ytrue, 1)+eps)
    ap[~has_positives] = torch.nan

    if print_metrics:
        print('mAP: {:.3f}'.format(ap.nanmean().item()))
        print('MRR: {:.3f}'.format(mrr.item()))
        print('MR: {:.3f}'.format(mr.item()))
        print('Top1: {:.0f}'.format(top1.item()))
        print('Top10: {:.0f}'.format(top10.item()))
    return ap.nanmean() if reduce_mean else ap

Quick summary:

  • replace means by nanmeans
  • populate nans for MR and MRR results on queries with null result sets

As an alternative / suggestion: you might consider accepting a fill_nan= parameter here, which could replace nans by zeros. The logic is that 0 could be a reasonable limiting value for MRR and meanAP (though not for MR) on null result queries, and including them could be reasonable in some situations.
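One possible shape for that option, sketched in plain Python rather than torch so that it runs without dependencies. The function name `reduce_scores` and its signature are assumptions made for illustration, not part of the existing code; only the `fill_nan` idea comes from the suggestion above.

```python
import math

def reduce_scores(scores, fill_nan=None):
    """Mean of per-query scores, where NaN marks a query with no positives.

    fill_nan=None skips NaN entries (nanmean-like behaviour);
    a number replaces each NaN with that value before averaging.
    """
    if fill_nan is None:
        kept = [s for s in scores if not math.isnan(s)]
    else:
        kept = [fill_nan if math.isnan(s) else s for s in scores]
    # If every query was null and nothing was filled, there is no mean to report.
    return sum(kept) / len(kept) if kept else math.nan

rr = [1.0, math.nan, 0.5, math.nan]  # toy reciprocal ranks (hypothetical)

skip = reduce_scores(rr)                # null queries excluded
zero = reduce_scores(rr, fill_nan=0.0)  # null queries counted as 0

print(skip)  # 0.75
print(zero)  # 0.375
```

Filling with 0 lowers the mean for MRR and meanAP, which matches the limiting-value argument. For MR no finite fill value has an obvious interpretation, so leaving those entries as NaN is probably the safer default there.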
