
rand-perturbations-defense's Introduction

Using Random Perturbations to Mitigate Adversarial Attacks on NLP Models

Motivation

Deep learning models have excelled at many difficult problems in Natural Language Processing (NLP), but they have also been shown to have extensive vulnerabilities. These NLP models are vulnerable to adversarial attacks, a problem exacerbated by the use of public datasets that are typically not manually inspected before use. We offer two defense methods against these attacks, both based on random perturbations. In our tests, both the Random Perturbations Defense and the Increased Randomness Defense were effective in returning attacked models to their original (pre-attack) accuracy for 6 different attacks from the TextAttack library.

Dependencies

Defense Methods

  • Natural Language Toolkit (nltk)
  • pyspellchecker
  • transformers
  • clean-text

Data Generation

  • TextAttack
  • transformers
  • tqdm 4.61

Data & Data Generation

We use the TextAttack library to generate data to test on. The data we used to test our methods can be found in the attacked_data folder. The following attack methods were used to generate 100 perturbed reviews from the IMDB dataset:

  • BERT-Based Adversarial Examples (BAE)
  • DeepWordBug
  • Faster Alzantot Genetic Algorithm
  • Kuleshov
  • Probability Weighted Word Saliency (PWWS)
  • TextBugger
  • TextFooler

More data can be generated using our data_generation file in the attacked_data folder. You must select the model_wrapper and the attack recipe you would like to use. To change the number of examples generated, change the num_examples argument in this line:

attack_args = textattack.AttackArgs(num_examples=100, shuffle=True, silent=True)
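For orientation, the generation step can be sketched roughly as follows. This is a minimal sketch and not the contents of the data_generation file: the function name generate_attacked_examples, the RECIPE_CLASS_NAMES mapping, and the default model name are assumptions made for illustration, and the target model you attack may differ.

```python
# Attack names from the list above mapped to TextAttack recipe class names.
RECIPE_CLASS_NAMES = {
    "BAE": "BAEGarg2019",
    "DeepWordBug": "DeepWordBugGao2018",
    "Faster Alzantot Genetic Algorithm": "FasterGeneticAlgorithmJia2019",
    "Kuleshov": "Kuleshov2017",
    "PWWS": "PWWSRen2019",
    "TextBugger": "TextBuggerLi2018",
    "TextFooler": "TextFoolerJin2019",
}


def generate_attacked_examples(recipe_name, num_examples=100,
                               model_name="textattack/bert-base-uncased-imdb"):
    """Attack an IMDB classifier and return the TextAttack results.

    model_name is an assumed default; substitute the model you want to attack.
    """
    # Imported here so the mapping above can be used without the heavy packages.
    import textattack
    import transformers

    # Wrap a HuggingFace classifier so TextAttack can query it.
    model = transformers.AutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
    model_wrapper = textattack.models.wrappers.HuggingFaceModelWrapper(model, tokenizer)

    # Build the chosen attack recipe against the wrapped model.
    recipe_cls = getattr(textattack.attack_recipes, RECIPE_CLASS_NAMES[recipe_name])
    attack = recipe_cls.build(model_wrapper)

    # Run the attack on IMDB test reviews.
    dataset = textattack.datasets.HuggingFaceDataset("imdb", split="test")
    attack_args = textattack.AttackArgs(num_examples=num_examples, shuffle=True, silent=True)
    attacker = textattack.Attacker(attack, dataset, attack_args)
    return attacker.attack_dataset()
```

Calling generate_attacked_examples("TextFooler", num_examples=10) would download the model and dataset on first use, so expect it to take a while.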

Using our Defense Methods

Both methods are in the defense_methods folder. Our Random Perturbations Defense can be tuned with the parameters l and k, which have the following properties:

  • l = number of replicates made for each sentence in a review
  • k = number of random corrections made for each replicate

We recommend l = 7 and k = 5 as a starting point, as these values gave us our best results. You can also choose which attack method to test against. The perturbed data in the attacked_data folder is preset as options in our existing code; change the value of data_to_use to select the data for your tests.
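To make the roles of l and k concrete, here is a toy sketch of one way such a defense could be structured. It is not the code in the defense_methods folder. It assumes that the replicates are classified individually and combined by majority vote, it splits sentences naively on periods, and it uses random word dropping as a stand-in for the random corrections. All function names are hypothetical.

```python
import random
from collections import Counter


def split_sentences(review):
    # Naive split on periods; a real implementation would use a proper tokenizer.
    return [s.strip() for s in review.split(".") if s.strip()]


def perturb(sentence, k, rng):
    # Stand-in perturbation: drop up to k random words, always keeping at least one.
    words = sentence.split()
    for _ in range(k):
        if len(words) <= 1:
            break
        del words[rng.randrange(len(words))]
    return " ".join(words)


def make_replicates(review, l, k, rng):
    # l perturbed replicates for every sentence in the review.
    return [perturb(s, k, rng) for s in split_sentences(review) for _ in range(l)]


def random_perturbations_defense(review, classify, l=7, k=5, seed=0):
    # Classify every replicate and return the most common label (assumed voting rule).
    replicates = make_replicates(review, l, k, random.Random(seed))
    if not replicates:
        raise ValueError("review contains no sentences")
    votes = Counter(classify(r) for r in replicates)
    return votes.most_common(1)[0][0]
```

Here classify is any function that maps a string to a label, for example a thin wrapper around a sentiment model. Word dropping with k = 5 is quite destructive on short sentences, so treat the perturbation as a placeholder to swap out.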

Our Increased Randomness Defense can be tuned with a single parameter k, which has the following property:

  • k = number of replicates made from randomly selected sentences in a review

We recommend k = 41 as a starting point, as this value gave us our best results. Again, you can choose which attack method to test against using the same steps as for the previous defense method.
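The single-parameter variant can be sketched in the same spirit. Again, this is an illustrative toy and not the repository code. It assumes that each of the k replicates comes from one randomly chosen sentence, receives one random perturbation (here a swap of two adjacent words, as a stand-in), and that the final label is a majority vote. The function names are hypothetical.

```python
import random
from collections import Counter


def perturb_once(sentence, rng):
    # Stand-in perturbation: swap one random pair of adjacent words.
    words = sentence.split()
    if len(words) > 1:
        i = rng.randrange(len(words) - 1)
        words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)


def increased_randomness_replicates(review, k, rng):
    # Draw k sentences at random (with replacement) and perturb each draw.
    sentences = [s.strip() for s in review.split(".") if s.strip()]
    if not sentences:
        raise ValueError("review contains no sentences")
    return [perturb_once(rng.choice(sentences), rng) for _ in range(k)]


def increased_randomness_defense(review, classify, k=41, seed=0):
    # Classify every replicate and return the most common label (assumed voting rule).
    replicates = increased_randomness_replicates(review, k, random.Random(seed))
    votes = Counter(classify(r) for r in replicates)
    return votes.most_common(1)[0][0]
```

Under the majority-vote assumption, an odd k such as 41 cannot produce a tie between two classes.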

Credits

TextAttack Library
🤗 HuggingFace Transformers Library
