Comments (2)
Hi, have you found the solution for this?
from beir.
Hi, it's not exactly the same issue, but I've noticed a similar problem on the reranking task.
For example, I'm using the following code:
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.reranking.models import CrossEncoder, MonoT5
from beir.reranking import Rerank

def do_evaluation(queries, qrels, corpus, results=None):
    k_values = [1, 5, 10, 20, 50, 100]
    retriever = EvaluateRetrieval()
    # `mode` is not defined in this snippet; presumably it holds the model name or path
    cross_encoder_model = MonoT5(mode, token_false='▁false', token_true='▁true')
    print(f"Loading cross-encoder model from: {cross_encoder_model.model.config._name_or_path}")
    reranker = Rerank(cross_encoder_model, batch_size=256)
    results = reranker.rerank(corpus, queries, results, top_k=100)  # outputs nan scores to results
    results = remove_nan(results)  # manually assign scores due to the bug
    ndcg, _map, recall, precision = retriever.evaluate(qrels, results, k_values)
It may be too late, but the order of the documents inside results is preserved even though the scores are nan,
so the following code can be used as a quick workaround:
def remove_nan(results):
    new_res = {}
    for query_key in results.keys():
        out = {}
        # Assign a descending score by position (top_k=100, so scores stay positive)
        for i, corpus_key in enumerate(results[query_key].keys()):
            out[corpus_key] = 100 - i
        new_res[query_key] = out
    return new_res
With this workaround, the evaluation outputs correct scores for nDCG, recall, and so on.
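For anyone who wants to keep valid scores where they exist, here is a minimal sketch of a variant that only rewrites queries whose scores contain nan. It relies on the same assumption as the workaround above, namely that the dict order reflects the reranker's ranking. The function name `rank_scores_if_nan` and the toy data are hypothetical and not part of BEIR.

```python
import math

def rank_scores_if_nan(results):
    """Replace scores with rank-based ones only for queries that contain nan."""
    fixed = {}
    for query_id, doc_scores in results.items():
        if any(math.isnan(score) for score in doc_scores.values()):
            n = len(doc_scores)
            # Assumption: dict order is the ranking, so earlier docs get higher scores
            fixed[query_id] = {doc_id: float(n - i) for i, doc_id in enumerate(doc_scores)}
        else:
            # No nan for this query: keep the original scores untouched
            fixed[query_id] = dict(doc_scores)
    return fixed

# Toy data (made up for illustration): q1 has nan scores, q2 does not
example = {
    "q1": {"d3": float("nan"), "d1": float("nan"), "d2": float("nan")},
    "q2": {"d5": 0.9, "d4": 0.1},
}
fixed = rank_scores_if_nan(example)
print(fixed)
```

Using the number of documents per query instead of a fixed 100 keeps the scores positive for any top_k, but it is still only a workaround; the real scores are lost.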