lm-ood's Introduction

Tuning Free OOD Detection

This repository contains the official code for the paper Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection (ACL 2023).


Setup

First, create a virtual environment for the project (we use Conda to create a Python 3.9 environment) and install all the requirements from requirements.txt.

  1. conda create -n ood_det python==3.9
  2. conda activate ood_det
  3. pip install -r requirements.txt

Now, create two directories:

  1. Data directory
  2. Model directory - create two subdirectories: pretrained_models and finetuned_models

Both directories can be located anywhere but should be specified in config.py (See more below).
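
As a convenience, the directory layout described above can also be created from Python. This is a minimal sketch, not part of the repository: the function name create_project_dirs and the example paths are hypothetical, and only the subdirectory names pretrained_models and finetuned_models come from the instructions above.

```python
from pathlib import Path


def create_project_dirs(data_dir, model_dir):
    """Create the data directory and the model directory with its two subdirectories.

    Hypothetical helper: the paths are whatever you later specify in config.py.
    """
    data_dir = Path(data_dir)
    model_dir = Path(model_dir)

    # Data directory (parents are created as needed; existing dirs are left alone).
    data_dir.mkdir(parents=True, exist_ok=True)

    # Model directory with the two subdirectories named in the setup steps.
    for sub in ("pretrained_models", "finetuned_models"):
        (model_dir / sub).mkdir(parents=True, exist_ok=True)

    return data_dir, model_dir


# Example call with placeholder paths (assumptions, adjust to your machine):
# create_project_dirs("/path/to/data", "/path/to/models")
```

The same paths would then be the ones entered in config.py.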


Running Experiments

At the start of each session, run the following:

  . bin/start.sh

Running OOD Detection on a Model

Directory paths and training arguments can be specified in config.py. Some important arguments are:

  • DATA_DIR: Path to the data directory
  • MODEL_DIR: Path to the model directory
  • task_name: Name of the dataset to use as in-distribution (ID) data.
  • ood_datasets: List of datasets to use as out-of-distribution (OOD) data.
  • model_class: Type of model to use. Options are roberta, gpt2, and t5 (the base version of each model is used).
  • do_train: If true, trains the model on the ID data before performing OOD detection.
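
To make the arguments above concrete, here is a hedged sketch of what such settings might look like, together with a small sanity check. Only the argument names and the three model_class options come from this README. Every value is a placeholder, the actual layout of config.py may differ, and check_config is a hypothetical helper that does not exist in the repository.

```python
# Placeholder values (assumptions); only the argument names come from the README.
DATA_DIR = "/path/to/data"            # path to the data directory
MODEL_DIR = "/path/to/models"         # path to the model directory
task_name = "id_dataset_name"         # in-distribution (ID) dataset
ood_datasets = ["ood_dataset_a", "ood_dataset_b"]  # out-of-distribution datasets
model_class = "roberta"               # one of: roberta, gpt2, t5
do_train = False                      # True: train on ID data before detection

SUPPORTED_MODEL_CLASSES = ("roberta", "gpt2", "t5")


def check_config(model_class, ood_datasets):
    """Return a list of problems found in the settings (empty if none).

    Hypothetical helper for illustration only.
    """
    problems = []
    if model_class not in SUPPORTED_MODEL_CLASSES:
        problems.append(
            f"model_class must be one of {SUPPORTED_MODEL_CLASSES}, got {model_class!r}"
        )
    if not ood_datasets:
        problems.append("ood_datasets is empty; list at least one OOD dataset")
    return problems


# An empty list means the placeholder settings pass both checks.
print(check_config(model_class, ood_datasets))
```

A check like this catches a mistyped model_class before a long run starts.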

After specifying the arguments, run the following command:

  python run_ood_detection.py

Training a Model through TAPT

Running the command below extends the pretraining process on a specified dataset (task-adaptive pretraining, TAPT). Once again, the values in config.py specify the dataset and model to use:

  python tapt_training/pretrain_roberta.py

After the new model has been saved to the pretrained_models directory within the MODELS_DIR directory (specified in config.py), it can be used for OOD detection by running run_ood_detection.py.


Citation

If you find this repo helpful, you are welcome to cite our work:

@inproceedings{uppaal-etal-2023-fine,
    title = "Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection",
    author = "Uppaal, Rheeya  and Hu, Junjie  and Li, Yixuan",
    editor = "Rogers, Anna  and Boyd-Graber, Jordan  and Okazaki, Naoaki",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-long.717",
    doi = "10.18653/v1/2023.acl-long.717",
    pages = "12813--12832",
}

Our codebase borrows from the following:

@inproceedings{zhou2021contrastive,
  title={Contrastive Out-of-Distribution Detection for Pretrained Transformers},
  author={Zhou, Wenxuan and Liu, Fangyu and Chen, Muhao},
  booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
  pages={1100--1111},
  year={2021}
}

@article{liu2020tfew,
  title={Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning},
  author={Liu, Haokun and Tam, Derek and Muqeeth, Mohammed and Mohta, Jay and Huang, Tenghao and Bansal, Mohit and Raffel, Colin},
  journal={arXiv preprint arXiv:2205.05638},
  year={2022}
}
