

MS MARCO Passage Ranking Submissions

This repo holds the official MS MARCO passage ranking leaderboard and describes the process for submitting runs. All associated data for the task (corpus, training data, eval queries, etc.) are available here.

Submission Instructions

To make a submission, please follow these instructions:

  1. Decide on a submission id, which will be a permanent (public) unique key. The submission id should be of the form yyyymmdd-foo, where foo can be a suffix of your choice, e.g., your group's name. Please keep the length reasonable. See here for examples. yyyymmdd should correspond to the submission date of your run.

  2. In the directory submissions/, create the following files (a minimal preparation sketch appears after the note below):

    1. submissions/yyyymmdd-foo/dev.txt.bz2 - run file on the dev queries (msmarco-docdev-queries.tsv), bz2-compressed

    2. submissions/yyyymmdd-foo/eval.txt.bz2 - run file on the eval queries (docleaderboard-queries.tsv), bz2-compressed

    3. submissions/yyyymmdd-foo-metadata.json, in the following format:

       {
         "team": "team name",
         "model_description": "model description",
         "paper": "url",              // URL to paper
         "code": "url",               // URL to code
         "type": "full ranking"       // either 'full ranking' or 'reranking'
       }
      

      Leave the values of paper and code empty (i.e., the empty string) if not available. These fields correspond to what is shown on the leaderboard.

  3. Run our evaluation script to make sure everything is in order (and fix any errors):

    $ python eval/run_eval.py --id yyyymmdd-foo
  4. Package (i.e., encrypt) the submission using the following script:

    $ eval/pack.sh yyyymmdd-foo
  5. Open a pull request against this repository. The subject (title) of the pull request should be "Submission yyyymmdd-foo", where yyyymmdd-foo is the submission id you decided on. This pull request should contain exactly three files:

    1. submissions/yyyymmdd-foo.key.bin.enc - the encrypted key
    2. submissions/yyyymmdd-foo.tar.enc - the encrypted tarball
    3. submissions/yyyymmdd-foo-metadata.json.enc - the encrypted metadata

IMPORTANT NOTE: You might want to save the unencrypted version of the key you've generated, i.e., submissions/yyyymmdd-foo.key.bin. You'll need it if you want to, for example, change your metadata later on. If you don't keep it, you'll lose it forever, because the pack.sh script generates a new random key each time (see here).
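For concreteness, here is a minimal sketch of preparing the submission files from step 2 and backing up the key from the note above. The input run files run.dev.txt and run.eval.txt and the backup directory are placeholder names, not part of the official process:

$ mkdir -p submissions/yyyymmdd-foo
$ bzip2 -c run.dev.txt > submissions/yyyymmdd-foo/dev.txt.bz2
$ bzip2 -c run.eval.txt > submissions/yyyymmdd-foo/eval.txt.bz2
# after eval/pack.sh has run, copy the unencrypted key somewhere private
$ cp submissions/yyyymmdd-foo.key.bin ~/msmarco-keys/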

Additional Submission Guidelines

The goal of the MS MARCO leaderboard is to encourage coopetition (cooperation + competition) among various groups working on deep learning and other methods for search that require or benefit from large-scale training data. So, while we encourage friendly competition between different participating groups for top positions on the leaderboard, our core motivation is to ensure that over time the leaderboard provides meaningful scientific insights about how different methods compare to each other and answers questions like whether we are making real progress as a research community. All participants are requested to abide by this spirit of coopetition and strictly observe good scientific principles when participating. We will follow an honour system and expect participants to ensure that they are acting in compliance with both the policies and the spirit of this leaderboard. We will also periodically audit all submissions ourselves and may flag issues as appropriate.

Frequency of Submission

The eval set is meant to be a blind set. We want to discourage modeling decisions based on eval numbers, to avoid overfitting to the set. To ensure this, we request that participants submit:

  1. No more than 2 runs in any given period of 30 days.
  2. No more than 1 run with very small changes, such as different random seeds or different hyper-parameters (e.g., small changes in number of layers or number of training epochs).

Participants who may want to run ablation studies on their models are encouraged to do so on the dev set, but not on the eval set.

Metadata Updates

The metadata you provide during run submission is meant to be permanent. However, we do allow "reasonable" updates to the metadata as long as they abide by the spirit of the leaderboard (see above). Such updates might include adding links to a paper or a code repository, fixing typos, clarifying the description of a run, etc. That said, we reserve the right to reject any changes.

It is generally expected that the team description in the metadata file will include the name of the organization (e.g., university or company). In many cases, submissions explicitly list the contributors of the run. It is not permissible to submit a run under an alias (or a generic, nondescript team name) to first determine "how you did", and then ask for a metadata change only after you've been shown to "do well". We will reject metadata change requests in these circumstances. Thus, you're advised to make the team description as specific as possible, so that you can claim "credit" for doing well. We further request that your team description unambiguously identify who you are (for example, your identity should be fairly clear from a web search). Submissions with metadata containing ambiguous team identities may be rejected.

To update the metadata of a particular run, you'll need to encrypt a new metadata JSON file with the same key that you used in the original submission. The command to encrypt the metadata is here. Hopefully, you've saved the key. If you've lost it, get in touch with us and we'll send it back to you via another channel (e.g., email). Once you've created the new encrypted metadata file (i.e., submissions/yyyymmdd-foo-metadata.json.enc), send us a pull request with it. Please make the subject of the pull request something obvious like "Metadata change for yyyymmdd-foo". Also, please make it clear to us that you have "permission" to change the metadata, e.g., that the person making the change request is the same person who performed the original submission.
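As a hedged sketch, the re-encryption mirrors the metadata-encryption step in eval/pack.sh, assuming you saved the original key as submissions/yyyymmdd-foo.key.bin:

$ openssl enc -aes-256-cbc -salt -in submissions/yyyymmdd-foo-metadata.json \
    -out submissions/yyyymmdd-foo-metadata.json.enc \
    -pass file:submissions/yyyymmdd-foo.key.bin -pbkdf2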

Anonymous Submissions

We allow anonymous submissions. Note that the purpose of an anonymous submission is to support blind reviewing for corresponding publications, not to serve as a probing mechanism for seeing how well you do and then only making your identity known if you do well.

Anonymous submissions should still contain accurate team and model information in the metadata JSON file, but on the leaderboard we will anonymize your entry. By default, we allow an embargo period of anonymous submissions for up to nine months. That is, after nine months, your identity will be revealed and the leaderboard will be updated accordingly. Additional extensions to the embargo period based on exceptional circumstances can be discussed on a case-by-case basis; please get in touch with the organizers.

For an anonymous submission, the metadata JSON file should have an additional field:

"embargo_until": "yyyy/mm/dd"

where the date, in yyyy/mm/dd format, cannot be more than nine months past the submission date. For example, if the submission date is 2020/11/01, the longest possible embargo period extends to 2021/07/31. Of course, you are free to specify a shorter embargo period if you wish.
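For illustration only, a complete metadata file for an anonymous submission might look like the following; the team, description, and type values are placeholders:

{
  "team": "Example University IR Group",
  "model_description": "BERT-based reranker over BM25 candidates",
  "paper": "",
  "code": "",
  "type": "reranking",
  "embargo_until": "2021/07/31"
}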

Note that even with an anonymous submission, the submission id is publicly known, as well as the person performing the submission. You might consider using a random string as the submission id, and you might consider creating a separate GitHub account for the sole purpose of submitting an anonymous run. Neither is necessary; we only provide this information for your reference.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Legal Notices

Microsoft and any contributors grant you a license to the Microsoft documentation and other content in this repository under the Creative Commons Attribution 4.0 International Public License, see the LICENSE file, and grant you a license to any code in the repository under the MIT License, see the LICENSE-CODE file.

Microsoft, Windows, Microsoft Azure and/or other Microsoft products and services referenced in the documentation may be either trademarks or registered trademarks of Microsoft in the United States and/or other countries. The licenses for this project do not grant you rights to use any Microsoft names, logos, or trademarks. Microsoft's general trademark guidelines can be found at http://go.microsoft.com/fwlink/?LinkID=254653.

Privacy information can be found at https://privacy.microsoft.com/en-us/

Microsoft and any contributors reserve all other rights, whether under their respective copyrights, patents, or trademarks, whether by implication, estoppel or otherwise.

msmarco-passage-ranking-submissions's People

Contributors

afalf, autoliuweijie, bmitra-msft, cdh4696, coris1207, crystina-z, deriq-qian-dong, emmagerritse, f766ece33d1fd43bfa618eb779e9ea5b, hongleizhuang, intfloat, iwvvwl, jadepark13, kkwarchiefs, lintool, ma787639046, microsoft-github-operations[bot], microsoftopensource, msmarco-bot, mxueguang, nirmal2k, nullzyz, revl147, rodrigonogueira4, searchivarius, tangzhy, yiyaxiaozhi, zyznull


msmarco-passage-ranking-submissions's Issues

How to evaluate results for past TREC DL tracks

I fine-tuned my model with the training data,
"Top 1000 Train top1000.train.tar.gz 175.0 GB 478,002,393 tsv: qid, pid, query, passage"
from https://microsoft.github.io/msmarco/TREC-Deep-Learning-2019,
and with the fine-tuned model I obtained results on both the dev set,
"Top 1000 Dev top1000.dev.tar.gz 2.5 GB 6,668,967 tsv: qid, pid, query, passage"
and the test set,
"Test msmarco-passagetest2019-top1000.tsv 71 MB 189,877 tsv: qid, pid, query, passage".

However, I don't know how to submit my results. I read the guidelines, but they do not provide detailed information on submitting previous years' results. Because I plan to run experiments on the TREC 2019, 2020, 2021, and 2022 Deep Learning tracks, knowing how to evaluate these results is important to me.

Can you provide detailed information on how to evaluate or submit previous years' results?
Looking forward to your reply.

How long will the board update?

Hi, I submitted a new result a week ago and it was successfully merged, but I notice that the result has not appeared on the leaderboard. How long does it take for the leaderboard to update? Thanks!

Question about uploading yyyymmdd.tar.enc

After I ran pack.sh, the yyyymmdd.tar.enc file was generated; it is 78 MB.
GitHub's "Add file" button does not allow uploading this file directly since it is over 25 MB. I then tried Git LFS, but got a message saying that Git LFS does not support pushing files to a public fork. How do I push the *.tar.enc file to my fork of MSMARCO-Passage-Ranking-Submissions?

Could you please update the leaderboard?

Hi,

I am so sorry to bother you, but could you please update the leaderboard?

Since the evaluation results have not been officially announced, I cannot use them in any materials, including papers and presentations.

If it takes some time to implement the automated update script, could you please manually update the leaderboard Excel file? I really want to use the evaluation results in my presentation on Friday.

Many thanks for considering my request.

Questions About Submission

Hi, I submitted results on the eval set two weeks ago and have not received any feedback yet. Has the submission process changed?

Cannot run pack.sh in macOS 11.6

I cannot run the following command from eval/pack.sh on macOS 11.6.

openssl enc -aes-256-cbc -salt -in submissions/"${1}"-metadata.json \
  -out submissions/"${1}"-metadata.json.enc -pass file:submissions/"${1}".key.bin -pbkdf2

I found that I can run it after removing the -pbkdf2 parameter, but I don't know whether the submission will still be evaluated properly.
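A hedged note, not confirmed in this thread: the openssl bundled with macOS is LibreSSL, which does not support the -pbkdf2 flag (introduced in OpenSSL 1.1.1). One way to check your version, and to use a compatible OpenSSL installed via Homebrew (assuming Homebrew is available), is:

$ openssl version                           # stock macOS ships LibreSSL
$ brew install openssl@3
$ "$(brew --prefix openssl@3)"/bin/openssl version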
