
context_attentive_ir's Introduction

Context-aware Neural Information Retrieval

Introduction

PyTorch code for our ICLR 2018 and SIGIR 2019 papers.

The codebase contains the source code of 8 document ranking models, 3 query suggestion models, and 3 multi-task context-aware ranking and suggestion models.

Document Ranking Models
Query Suggestion Models

Please note, we have a simplified implementation of ACG.

Multi-task Learning Models

Requirements

Training/Testing Models

$ cd scripts
$ bash SCRIPT_NAME GPU_ID MODEL_NAME
  • To train/test document ranking models, use ranker.sh in place of SCRIPT_NAME
  • To train/test query suggestion models, use recommender.sh in place of SCRIPT_NAME
  • To train/test multitask models, use multitask.sh in place of SCRIPT_NAME

Here is a list of models which you can use in place of MODEL_NAME.

  • Document Ranking Models: esm, dssm, cdssm, drmm, arci, arcii, duet, match_tensor
  • Query Suggestion Models: seq2seq, hredqs, acg
  • Multitask Models: mnsrf, m_match_tensor, cars

For example, if you want to run our CARS model, run the following command.

bash multitask.sh GPU_ID cars
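
The script and model names listed above determine the full command line. As a minimal sketch, the hypothetical helper below (build_command is not part of this repository) looks up which script a model belongs to and assembles the corresponding `bash SCRIPT_NAME GPU_ID MODEL_NAME` invocation. It assumes the command is run from inside the scripts directory.

```python
# Script-to-model mapping copied from the lists in this README.
# The helper itself is hypothetical and only illustrates the naming convention.
SCRIPTS = {
    "ranker.sh": ["esm", "dssm", "cdssm", "drmm", "arci", "arcii", "duet", "match_tensor"],
    "recommender.sh": ["seq2seq", "hredqs", "acg"],
    "multitask.sh": ["mnsrf", "m_match_tensor", "cars"],
}


def build_command(model_name, gpu_id="0"):
    """Return the argument list for `bash SCRIPT_NAME GPU_ID MODEL_NAME`."""
    for script, models in SCRIPTS.items():
        if model_name in models:
            return ["bash", script, str(gpu_id), model_name]
    raise ValueError(f"Unknown model name: {model_name!r}")


if __name__ == "__main__":
    # Print the command for the CARS model on GPU 0.
    print(" ".join(build_command("cars", "0")))
```

The returned list can be passed to `subprocess.run(..., cwd="scripts")` if you prefer to launch runs from Python rather than from the shell.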

Running experiments on CPU/GPU/Multi-GPU
  • If GPU_ID is set to -1, CPU will be used.
  • If GPU_ID is set to one specific number, only one GPU will be used.
  • If GPU_ID is set to multiple numbers (e.g., 0,1,2), then parallel computing will be used.
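
The three GPU_ID cases above can be sketched as a small parser. This is only an illustration of the convention described in the list: the function name parse_gpu_id and the mode labels are assumptions, and how the repository's scripts actually consume GPU_ID is not shown in this README.

```python
def parse_gpu_id(gpu_id):
    """Classify a GPU_ID value as CPU, single-GPU, or multi-GPU.

    Hypothetical helper: mirrors the three cases listed in this README,
    not the repository's actual implementation.
    """
    ids = [int(tok) for tok in str(gpu_id).split(",") if tok.strip()]
    if not ids:
        raise ValueError("GPU_ID must not be empty")
    if ids == [-1]:
        return "cpu", []            # -1 selects the CPU
    if any(i < 0 for i in ids):
        raise ValueError("-1 cannot be combined with other device ids")
    if len(ids) == 1:
        return "single-gpu", ids    # one specific device
    return "multi-gpu", ids         # several devices, used in parallel


if __name__ == "__main__":
    for value in ("-1", "0", "0,1,2"):
        print(value, parse_gpu_id(value))
```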

An Artificial Dataset

We are unable to make our experimental dataset publicly available. However, we are sharing scripts to create an artificial dataset from the MSMARCO Q&A v2.1 and MSMARCO Conversational Search datasets. To generate it, go into the /data/msmarco/ directory and run the script there. Once the data is generated, you should see a table showing the following statistics.

| Attribute | Train | Dev | Test |
|---|---|---|---|
| Sessions | 223876 | 24832 | 27673 |
| Queries | 1530546 | 169413 | 189095 |
| Avg Session Len | 6.84 | 6.82 | 6.83 |
| Avg Query Len | 3.84 | 3.85 | 3.84 |
| Max Query Len | 40 | 32 | 32 |
| Avg Doc Len | 63.41 | 63.43 | 63.48 |
| Max Doc Len | 290 | 290 | 290 |
| Avg Click Per Query | 1.05 | 1.05 | 1.05 |
| Max Click Per Query | 6 | 6 | 6 |
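
As a quick sanity check after generating the data, the reported statistics can be cross-checked against each other. The sketch below assumes that "Avg Session Len" means the number of queries per session (an interpretation, not something stated here) and recomputes it from the Sessions and Queries rows; the function name avg_session_len is hypothetical.

```python
# (sessions, queries, reported average session length), copied from the table above.
STATS = {
    "Train": (223876, 1530546, 6.84),
    "Dev": (24832, 169413, 6.82),
    "Test": (27673, 189095, 6.83),
}


def avg_session_len(sessions, queries):
    """Queries per session, rounded to two decimals like the table."""
    return round(queries / sessions, 2)


if __name__ == "__main__":
    for split, (sessions, queries, reported) in STATS.items():
        computed = avg_session_len(sessions, queries)
        print(f"{split}: computed={computed} reported={reported}")
```

Under that assumption the recomputed values agree with the table for all three splits, so a mismatch after running the data script would suggest the generated dataset differs from the one described here.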

Results on the Artificial Dataset

Coming soon!

Acknowledgement

I borrowed and modified code from DrQA and OpenNMT. I would like to express my gratitude to the authors of these repositories.

Citation

If you find the resources in this repo useful, please cite our papers.

@inproceedings{Ahmad:2019:CAD:3331184.3331246,
 author = {Ahmad, Wasi Uddin and Chang, Kai-Wei and Wang, Hongning},
 title = {Context Attentive Document Ranking and Query Suggestion},
 booktitle = {Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval},
 year = {2019},
 pages = {385--394}
} 
@inproceedings{uddin2018multitask,
 title={Multi-Task Learning for Document Ranking and Query Suggestion},
 author={Wasi Uddin Ahmad and Kai-Wei Chang and Hongning Wang},
 booktitle={International Conference on Learning Representations},
 year={2018}
}
