DiffIR

DiffIR is a tool for visually comparing ("diffing") two sets of rankings. Given a pair of TREC runs containing rankings for multiple queries, DiffIR identifies contrasting queries whose results differ "substantially" between the two systems and generates a side-by-side visual comparison illustrating how the key rankings differ.

DiffIR supports multiple query contrast measures for ranking comparison, including unsupervised ranking correlations like TauAP and supervised comparison based on existing judgments. DiffIR additionally accepts term importance weights in order to highlight the terms most relevant to a model's relevance prediction.
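
To make the idea of an unsupervised contrast measure concrete, here is a minimal sketch of an AP-weighted Kendall's tau (the rank correlation usually abbreviated TauAP) in plain Python. It is not DiffIR's implementation: the names `tau_ap` and `most_contrasting` are hypothetical, and the sketch assumes both rankings for a query contain exactly the same documents.

```python
def tau_ap(reference, candidate):
    """AP-weighted Kendall's tau of `candidate` measured against `reference`.

    Both arguments are lists of document ids, best first. This sketch
    assumes they contain the same ids with no duplicates.
    """
    if len(set(candidate)) != len(candidate) or set(reference) != set(candidate):
        raise ValueError("rankings must contain the same unique document ids")
    n = len(candidate)
    if n < 2:
        raise ValueError("need at least two documents to compare")
    ref_pos = {doc: i for i, doc in enumerate(reference)}
    total = 0.0
    for i in range(1, n):
        doc = candidate[i]
        # Count documents ranked above `doc` that the reference also ranks above it.
        correct = sum(1 for above in candidate[:i] if ref_pos[above] < ref_pos[doc])
        total += correct / i
    return 2.0 * total / (n - 1) - 1.0


def most_contrasting(run_a, run_b, k):
    """Return the k query ids whose two rankings agree least (lowest tau_ap)."""
    shared = sorted(set(run_a) & set(run_b))
    scored = sorted((tau_ap(run_a[q], run_b[q]), q) for q in shared)
    return [q for _, q in scored[:k]]
```

The measure is asymmetric: disagreements near the top of the candidate ranking cost more than disagreements near the bottom, which is why it suits comparing search results. Identical rankings score 1 and fully reversed rankings score -1.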

Usage

Installation

Python 3 is required. Install via PyPI:

pip install diffir

Usage

Download two run files to test with:

wget -c https://github.com/capreolus-ir/diffir/raw/master/trec-dl-2020/p_bm25
wget -c https://github.com/capreolus-ir/diffir/raw/master/trec-dl-2020/p_bm25rm3
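
If you want to inspect the downloaded runs before diffing them, the following sketch reads a run into a dictionary. It assumes the conventional six-column TREC run layout (query id, the literal Q0, document id, rank, score, run tag); the function name `parse_trec_run` is hypothetical and is not part of DiffIR.

```python
from collections import defaultdict


def parse_trec_run(lines):
    """Map each query id to its document ids, ordered by descending score.

    `lines` is any iterable of strings, e.g. an open file object.
    """
    scored = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
        qid, _q0, docid, _rank, score, _tag = fields
        scored[qid].append((float(score), docid))
    return {
        qid: [docid for _score, docid in sorted(pairs, key=lambda p: -p[0])]
        for qid, pairs in scored.items()
    }


# Usage with a downloaded file (assumes p_bm25 is in the current directory):
#     with open("p_bm25") as f:
#         run = parse_trec_run(f)
#     print(len(run), "queries")
```

Sorting by score rather than trusting the rank column is a design choice of this sketch; it keeps the result consistent even if the rank column is stale.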

Compare the two files and output a comparison page to bm25_bm25rm3.html:

diffir p_bm25 p_bm25rm3 -w --dataset msmarco-passage/trec-dl-2020 \
       --measure qrel --metric nDCG@5 --topk 3 > bm25_bm25rm3.html
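
The same invocation can be scripted from Python, which is handy when comparing many configurations. The sketch below only assembles the flags shown in this README and redirects standard output to a file, as the shell command does. The helper names `build_diffir_command` and `write_comparison` are hypothetical, and the sketch assumes the `diffir` executable is on your PATH.

```python
import subprocess


def build_diffir_command(run_a, run_b, dataset, measure="qrel", metric=None, topk=None):
    """Assemble the argument list for an HTML comparison of two run files."""
    cmd = ["diffir", run_a, run_b, "-w", "--dataset", dataset, "--measure", measure]
    if metric is not None:
        cmd += ["--metric", metric]
    if topk is not None:
        cmd += ["--topk", str(topk)]
    return cmd


def write_comparison(cmd, out_path):
    """Run the command and save its standard output (the HTML page) to out_path."""
    with open(out_path, "w", encoding="utf-8") as out:
        subprocess.run(cmd, stdout=out, check=True)


# Equivalent of the shell command above:
#     cmd = build_diffir_command("p_bm25", "p_bm25rm3",
#                                "msmarco-passage/trec-dl-2020",
#                                metric="nDCG@5", topk=3)
#     write_comparison(cmd, "bm25_bm25rm3.html")
```

Passing the arguments as a list avoids shell quoting problems, and `check=True` raises an exception if the tool exits with an error instead of silently leaving a partial HTML file.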

Now open bm25_bm25rm3.html in your web browser to see DiffIR's web interface.

Command line arguments

Usage: diffir <run files> <options> where <run files> are one or two positional arguments naming the run files to visualize, and <options> are:

  • -w to output HTML or -c for the command line interface
  • --dataset <id>: a dataset id from ir_datasets
  • --measure <measure>: the query contrast measure to use. Valid measures: qrel, tauap, pearsonrank, weightedtau, spearmanr, kldiv (using scores)
  • --metric <metric>: the relevance metric to use with the qrel measure. Accepts ir_measures notation
  • --topk <k>: the number of queries to compare (as identified by the query contrast measure)
  • --weights_1 <file>, --weights_2 <file>: term importance files to use for snippet selection
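
To illustrate how --measure qrel, --metric and --topk fit together, the sketch below scores each query with nDCG@k for both runs and keeps the queries with the largest gap. It is an illustration of the idea, not DiffIR's code: the function names are hypothetical, the use of the absolute difference is an assumption, and nDCG is computed with linear gains over non-negative relevance labels.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with linear gains and a log2 rank discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(ranking, qrels, k):
    """nDCG@k of a ranked list of doc ids, given {doc_id: relevance} judgments."""
    gains = [qrels.get(doc, 0) for doc in ranking[:k]]  # unjudged docs count as 0
    ideal_dcg = dcg(sorted(qrels.values(), reverse=True)[:k])
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def top_contrast_queries(run_a, run_b, qrels, k, topk):
    """Return the `topk` query ids with the largest |nDCG@k(a) - nDCG@k(b)|."""
    deltas = []
    for qid in set(run_a) & set(run_b) & set(qrels):
        a = ndcg_at_k(run_a[qid], qrels[qid], k)
        b = ndcg_at_k(run_b[qid], qrels[qid], k)
        deltas.append((abs(a - b), qid))
    # Largest difference first; ties broken by query id for determinism.
    deltas.sort(key=lambda t: (-t[0], t[1]))
    return [qid for _, qid in deltas[:topk]]
```

Here `run_a` and `run_b` map query ids to ranked lists of document ids, and `qrels` maps query ids to per-document relevance labels. In practice the ir_measures package would be the natural source for the metric values.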

Batch mode

Use diffir-batch to generate comparison pages for every pair of run files in a directory.

Usage: diffir-batch <input directory> -o <output directory> <options> where the <options> are those shown above.
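
To get a feel for how many pages "every pair" produces, the sketch below enumerates unordered pairs of files in a directory. How diffir-batch itself discovers and orders run files is not specified here, so treat this purely as an illustration; `run_pairs` and the file names run_a, run_b and run_c are hypothetical.

```python
import tempfile
from itertools import combinations
from pathlib import Path


def run_pairs(input_dir):
    """Return every unordered pair of regular files in input_dir, sorted by path."""
    files = sorted(p for p in Path(input_dir).iterdir() if p.is_file())
    return list(combinations(files, 2))


# Demo on a throwaway directory holding three empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("run_a", "run_b", "run_c"):
        (Path(tmp) / name).touch()
    demo_pairs = [(a.name, b.name) for a, b in run_pairs(tmp)]

print(demo_pairs)
```

A directory with n run files yields n * (n - 1) / 2 pairs, so the number of comparison pages grows quadratically with the number of runs.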

Contributors

andrewyates, kevinmartinjos, thongnt99
