furkanyesiler / re-move
Training and evaluation code for Re-MOVE models with embedding distillation
License: GNU Affero General Public License v3.0
Where can I find the Da-Tacos training subset?
Thanks in advance!
(Tagging @helenkesete )
We were looking at the evaluation metrics implemented here, and ran into what I think is a pretty subtle bug in the MRR calculation.
Lines 27 to 32 in 5ddd5df
The above code identifies the position of the first positive result if one exists. However, if no positive result exists, then the topk call will just return the first position, because found = [0, 0, 0, 0, ...], so found - temp = [0, -1e-6, -2e-6, -3e-6, ...]. This results in an inflated metric for queries with null ytrue sets.
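To make the failure mode concrete, here is a minimal sketch that re-enacts the tie-breaking logic with plain Python lists instead of tensors, so it runs without PyTorch. The helper name first_hit_rank is hypothetical and is not part of the repository; it only mimics what topk(found - temp, 1) selects.

```python
def first_hit_rank(found, eps=1e-6):
    """Mimic `topk(found - temp, 1)`: 1-based index of the largest tie-broken score.

    Hypothetical helper for illustration only; not the repository's code.
    """
    scores = [f - i * eps for i, f in enumerate(found)]
    sel = max(range(len(scores)), key=scores.__getitem__)
    return sel + 1


hit_at_3 = [0, 0, 1, 0]  # first positive sits at rank 3
no_hits = [0, 0, 0, 0]   # query whose ytrue set is empty

ranks = [first_hit_rank(hit_at_3), first_hit_rank(no_hits)]
print(ranks)  # [3, 1]: the empty query is scored as a perfect first-rank hit

# Averaging reciprocal ranks over both queries inflates MRR (about 0.667),
# whereas restricting to queries with positives gives 1/3.
naive_mrr = sum(1 / r for r in ranks) / len(ranks)
print(round(naive_mrr, 3))
```

The empty query contributes a reciprocal rank of 1.0, which is the inflation described above.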
Looking a bit further down in the evaluation, I noticed that this case is handled correctly(*) in meanAP:
Lines 40 to 41 in 5ddd5df
where the mean over AP scores is restricted only to those queries where sum(ytrue) > 0.
There are two caveats to this:
If reduce_mean=False is passed in, the returned vector of per-query AP scores is the one already conditioned on having positive results, but the evaluator has lost track of the corresponding indices. This will make it difficult to link back to the input data later on.
So I have a couple of proposed modifications:
# Identify queries with positive results
has_positives = torch.sum(ytrue, 1) > 0
_, spred = torch.topk(ypred, k, dim=1)
found = torch.gather(ytrue, 1, spred)
temp = torch.arange(k).float() * 1e-6
_, sel = torch.topk(found - temp, 1, dim=1)
# Knock out queries with no positives
sel = sel.float()
sel[~has_positives] = torch.nan
mrr = torch.nanmean(1/(sel+1).float())
mr = torch.nanmean((sel+1).float())
top1 = torch.sum(found[:, 0])
top10 = torch.sum(found[:, :10])
pos = torch.arange(1, spred.size(1)+1).unsqueeze(0).to(ypred.device)
prec = torch.cumsum(found, 1)/pos.float()
mask = (found > 0).float()
ap = torch.sum(prec*mask, 1)/(torch.sum(ytrue, 1)+eps)
ap[~has_positives] = torch.nan
if print_metrics:
    print('mAP: {:.3f}'.format(ap.nanmean().item()))
    print('MRR: {:.3f}'.format(mrr.item()))
    print('MR: {:.3f}'.format(mr.item()))
    print('Top1: {:.0f}'.format(top1.item()))
    print('Top10: {:.0f}'.format(top10.item()))
return ap.nanmean() if reduce_mean else ap
Quick summary:
As an alternative / suggestion: you might consider accepting a fill_nan= parameter here, which could replace NaNs by zeros. The logic here being that 0 could be a reasonable limiting value for MRR and meanAP (not MR though) on null result queries, and it could be reasonable to include them in some situations.
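As a rough sketch of how such a fill_nan= option might behave, the following uses plain Python floats rather than tensors so that it runs without PyTorch. The function name reduce_scores and its signature are assumptions for illustration, not an existing API in the repository. In tensor code, torch.nan_to_num followed by a mean would presumably play the same role.

```python
import math


def reduce_scores(per_query, fill_nan=None):
    """Average per-query scores, where NaN marks a query with no positives.

    Hypothetical helper: with fill_nan=None, NaN entries are skipped
    (nanmean-like behaviour); otherwise they are replaced by fill_nan first.
    """
    if fill_nan is not None:
        per_query = [fill_nan if math.isnan(s) else s for s in per_query]
    valid = [s for s in per_query if not math.isnan(s)]
    return sum(valid) / len(valid) if valid else math.nan


# Reciprocal ranks for four queries; the third has an empty ytrue set.
rr = [1.0, 0.5, math.nan, 0.25]

print(reduce_scores(rr))                # mean over the 3 queries with positives
print(reduce_scores(rr, fill_nan=0.0))  # 0.4375: empty query counted as 0
```

Filling with 0 lowers the average because the empty query now counts against the system. As noted above, the same substitution would not make sense for MR, where 0 is not a meaningful rank.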
The readme states that it will be updated soon. Are these updates ready?