Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training

arXiv: Abstract / PDF

📣 News

  • [16/May/2024] 🎉 Our paper has been accepted to the ACL 2024 Main Conference (the 62nd Annual Meeting of the Association for Computational Linguistics)!

✨ Abstract

Large Language Models (LLMs) exhibit substantial capabilities yet encounter challenges, including hallucination, outdated knowledge, and untraceable reasoning processes. Retrieval-augmented generation (RAG) has emerged as a promising solution, integrating knowledge from external databases to mitigate these challenges. However, inappropriate retrieved passages can potentially hinder the LLMs' capacity to generate comprehensive and high-quality responses. Prior RAG studies on the robustness of retrieval noises often confine themselves to a limited set of noise types, deviating from real-world retrieval environments and limiting practical applicability. In this study, we initially investigate retrieval noises and categorize them into three distinct types, reflecting real-world environments. We analyze the impact of these various retrieval noises on the robustness of LLMs. Subsequently, we propose a novel RAG approach known as Retrieval-augmented Adaptive Adversarial Training (RAAT). RAAT leverages adaptive adversarial training to dynamically adjust the model's training process in response to retrieval noises. Concurrently, it employs multi-task learning to ensure the model's capacity to internally recognize noisy contexts. Extensive experiments demonstrate that the LLaMA-2 7B model trained using RAAT exhibits significant improvements in F1 and EM scores under diverse noise conditions.

✨ The overview of RAAT

💪 Dataset

Data Preparation

We provide the RAG-Bench for training and testing, available at https://drive.google.com/file/d/1i4umieNgG3dctNqdTMI3Rj5tsrR5JvnM/view?usp=sharing

retrieval_robustness_benchmark

  • train.json
  • dev.json
  • test.json

  1. train_data: 4,500 samples.

    best_ctx: golden retrieval

    ad_ctx: [relevant retrieval noise, counterfactual retrieval noise, irrelevant retrieval noise]

  2. dev_data: 300 samples.

  3. test_data: 3,000 samples.

    Golden retrieval: chosen from "best_ctx".

    Relevant retrieval noise: chosen from "ctxs" (without answers).

    Irrelevant retrieval noise: chosen from other samples.

    Counterfactual retrieval noise: chosen from "counter_fac".
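The four test conditions listed above can be sketched as a small helper. This is an illustration only: the field names "best_ctx", "ctxs", and "counter_fac" come from the list above, but the helper build_context, the noise-type labels, the assumption that each field holds a list of passage strings, and the choice to draw irrelevant noise from another sample's "best_ctx" are all hypothetical. The real JSON schema may differ.

```python
import random


def build_context(sample, other_samples, noise_type, rng):
    """Return the golden passage plus at most one noisy passage.

    Assumption: "best_ctx", "ctxs", and "counter_fac" each map to a
    non-empty list of passage strings (the actual schema may differ).
    """
    golden = rng.choice(sample["best_ctx"])
    if noise_type == "golden":
        return [golden]
    if noise_type == "relevant":
        noise = rng.choice(sample["ctxs"])
    elif noise_type == "counterfactual":
        noise = rng.choice(sample["counter_fac"])
    elif noise_type == "irrelevant":
        # Irrelevant noise is borrowed from a different sample.
        # Which field to borrow from is an assumption of this sketch.
        donor = rng.choice([s for s in other_samples if s is not sample])
        noise = rng.choice(donor["best_ctx"])
    else:
        raise ValueError(f"unknown noise type: {noise_type}")
    return [golden, noise]
```

Passing an explicit random.Random instance keeps the selection reproducible for a fixed seed.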

The test data used in the paper are cached in RAAT\benchmark_cache:

golden retrieval: o.json

golden retrieval + relevant retrieval noise: p.json

golden retrieval + irrelevant retrieval noise: c.json

golden retrieval + counterfactual retrieval noise: f.json
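As a convenience, the mapping from test condition to cache file can be written down once. This is a minimal sketch: the file names are the ones listed above, while the condition labels, the cache_path helper, and the default directory argument are hypothetical.

```python
from pathlib import Path

# File names as listed above; the condition labels are hypothetical.
CACHE_FILES = {
    "golden": "o.json",
    "golden+relevant": "p.json",
    "golden+irrelevant": "c.json",
    "golden+counterfactual": "f.json",
}


def cache_path(condition, cache_dir="RAAT/benchmark_cache"):
    """Return the path of the cached test file for one condition."""
    return Path(cache_dir) / CACHE_FILES[condition]
```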

The training data used in the paper: RAAT\tuner\data\temp.json.

You can download temp.json from the following link: https://drive.google.com/file/d/109CVe8KWiYdpZLkz4nZjDZklYdUjxaZ2/view?usp=sharing

What is the difference between the training and test data we used in the paper and RAG-Bench?

The training and test data used in the paper are subsets of RAG-Bench. RAG-Bench provides multiple noisy retrieval samples for each type of retrieval noise, but training and testing need only one noisy sample per noise type. To keep the random choice of retrieval samples from affecting the results, we cache the selected training and test data. To reproduce the results in the paper, use our cached selection.
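The idea of selecting once and caching the result can be sketched as follows. This is not the script used to build our caches: the select_and_cache helper and the "question" and "candidates" field names are hypothetical, and the sketch only illustrates why a cached selection removes run-to-run variation.

```python
import json
import random
from pathlib import Path


def select_and_cache(samples, cache_file, seed=0):
    """Pick one noisy passage per noise type for every sample, once.

    If cache_file already exists it is reused, so later runs see the
    same selection regardless of the seed they pass in.
    Assumption: each sample has a "candidates" dict that maps a noise
    type to a non-empty list of passages (hypothetical field names).
    """
    cache_file = Path(cache_file)
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    rng = random.Random(seed)
    selected = [
        {
            "question": s["question"],
            "noise": {k: rng.choice(v) for k, v in s["candidates"].items()},
        }
        for s in samples
    ]
    cache_file.write_text(
        json.dumps(selected, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return selected
```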

💪 Usage

Train

We provide scripts for training the model. For example, run the following commands:

cd RAAT
pip install -r requirements.txt
mkdir checkpoints
mkdir logs
cp -r path_to_retrieval_robustness_benchmark ./tuner/data/
cp path_to_temp ./tuner/data/
cd scripts
bash train.sh

The scripts can be easily modified to train LLMs with different datasets.

Note: Before running, model_name_or_path has to be specified. Also download RAG-Bench and temp.json beforehand.
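A quick check that the data files are in place before training might look like the sketch below. The missing_training_files helper is hypothetical, and it assumes that the copied benchmark directory keeps the name retrieval_robustness_benchmark under tuner/data, which follows from the cp commands above only if the source directory has that name.

```python
from pathlib import Path


def missing_training_files(repo_root="."):
    """List expected data files that are not present under repo_root.

    Assumption: the benchmark directory is named
    retrieval_robustness_benchmark after being copied into tuner/data.
    """
    data_dir = Path(repo_root) / "tuner" / "data"
    expected = [data_dir / "temp.json"] + [
        data_dir / "retrieval_robustness_benchmark" / name
        for name in ("train.json", "dev.json", "test.json")
    ]
    return [str(p) for p in expected if not p.is_file()]
```

Run it from the RAAT directory; an empty list means every expected file was found.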

Test

The following command can be used to test the model:

cd RAAT
cd scripts
bash test.sh

Note: Before running, the test_model_name_or_path has to be specified.
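For reference, EM and F1 are commonly computed for question answering as in the sketch below, which follows the widely used SQuAD-style normalization (lowercasing, stripping punctuation and English articles). This is an assumption about the general convention, not a copy of the evaluation code in this repository, which may normalize answers differently.

```python
import re
import string
from collections import Counter

_PUNCT = set(string.punctuation)


def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCT)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, answer):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(answer))


def f1_score(prediction, answer):
    """Token-level F1 between normalized prediction and answer."""
    pred = normalize(prediction).split()
    gold = normalize(answer).split()
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, the usual practice is to take the maximum score over the references.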

🔓 Citation

If you find this work helpful, please cite our paper:

@article{fang2024enhancing,
  title={Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training},
  author={Fang, Feiteng and Bai, Yuelin and Ni, Shiwen and Yang, Min and Chen, Xiaojun and Xu, Ruifeng},
  journal={arXiv preprint arXiv:2405.20978},
  year={2024}
}
