

NineRec: A Benchmark Dataset Suite for Evaluating Transferable Recommendation

Dataset

We have released all 9 downstream datasets, and we will provide access to the source dataset once the paper is accepted. To acquire the complete NineRec dataset, kindly reach out to the corresponding author via email. If you have an innovative idea for building a foundational recommendation model but require a large dataset and computational resources, consider joining our lab as an intern. We can provide access to 100 NVIDIA 80G A100 GPUs and a billion-level dataset of user-video/image/text interactions.

Download link:

If you are interested in pre-training, a relatively large image dataset is available at https://github.com/westlake-repl/IDvs.MoRec. Please follow the instructions provided there to use the dataset properly, as it is not fully published yet. If you want to pre-train a foundation RecSys model on a very large-scale image/video/text dataset, contact our lead authors by email.

Citation

If you use our dataset or code, or find NineRec useful in your work, please cite our paper:

@article{zhang2023ninerec,
      title={NineRec: A Benchmark Dataset Suite for Evaluating Transferable Recommendation}, 
      author={Jiaqi Zhang and Yu Cheng and Yongxin Ni and Yunzhu Pan and Zheng Yuan and Junchen Fu and Youhua Li and Jie Wang and Fajie Yuan},
      journal={arXiv preprint arXiv:2309.07705},
      year={2023}
}

⚠️ Caution: It's prohibited to privately modify the dataset and offer secondary downloads. If you've made alterations to the dataset in your work, you are encouraged to open-source the data processing code, so others can benefit from your methods.

Benchmark

Environments

PyTorch==1.12.1
cudatoolkit==11.2.1
scikit-learn==1.2.0
python==3.9.12
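
As a rough sanity check before running the benchmark, the pinned versions above can be compared with what is installed. The sketch below is not part of the NineRec code. It assumes the PyTorch and sklearn entries were installed under the PyPI distribution names `torch` and `scikit-learn`, and `check_pins` is a hypothetical helper name. The `cudatoolkit` and `python` entries are not pip distributions, so they are left out.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the environment list; the distribution names
# ("torch", "scikit-learn") are assumptions about how they were installed.
PINNED = {"torch": "1.12.1", "scikit-learn": "1.2.0"}


def check_pins(pinned, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None
        # Ignore local build tags such as "+cu113" when comparing.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means both packages match their pins. A value of `None` in the second position means the package was not found at all.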

Dataset Preparation

Run get_lmdb.py to build the LMDB database used for image loading. Run get_behaviour.py to convert the user-item pairs into per-user item sequences.
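
To make the second step concrete, here is a minimal sketch of turning user-item pairs into item sequences. It does not reproduce get_behaviour.py: the `(user, item, timestamp)` record layout, the `min_len` filter, and the function name `pairs_to_sequences` are all assumptions made for illustration.

```python
from collections import defaultdict


def pairs_to_sequences(interactions, min_len=1):
    """Group (user, item, timestamp) records into per-user item sequences.

    The record layout and the min_len filter are illustrative assumptions,
    not the format used by get_behaviour.py.
    """
    by_user = defaultdict(list)
    for user, item, ts in interactions:
        by_user[user].append((ts, item))

    sequences = {}
    for user, events in by_user.items():
        # Sort chronologically; the sort is stable for equal timestamps.
        events.sort(key=lambda event: event[0])
        if len(events) >= min_len:
            sequences[user] = [item for _, item in events]
    return sequences


pairs = [("u1", "v3", 20), ("u2", "v1", 5), ("u1", "v7", 10), ("u1", "v2", 30)]
print(pairs_to_sequences(pairs))
# {'u1': ['v7', 'v3', 'v2'], 'u2': ['v1']}
```

Passing `min_len=2` would drop `u2` here, which is one common way to filter out users with too few interactions for sequential training.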

Run Experiments

Run train.py for pre-training and transferring. Run test.py for testing.

Leaderboard

Coming soon.

Tenrec

Tenrec (https://github.com/yuangh-x/2022-NIPS-Tenrec) is the sibling dataset of NineRec. It covers multiple types of user feedback and multiple platforms, and is suitable for studying ID-based transfer and lifelong learning.

News

Our lab is hiring research assistants, interns, PhD students, and postdocs. Please contact the corresponding author.

