sec4sr's Introduction

Usage

Requirements

pytorch=1.6.0, torchaudio=0.6.0, numpy=1.19.2, scipy=1.4.1, libkmcuda=6.2.3, torch-lfilter=0.0.3, pesq=0.0.2, pystoi=0.3.3

Dataset Preparation

We provide five datasets: Spk10_enroll, Spk10_test, Spk10_imposter, Spk251_train, and Spk251_test. They cover all the recognition tasks (i.e., CSI-E, CSI-NE, SV, and OSI). The code in dataset/Dataset.py downloads them automatically. You can also download them manually using the following links:

Spk10_enroll, 18MB, MD5:0e90fb00b69989c0dde252a585cead85

Spk10_test, 114MB, MD5:b0f8eb0db3d2eca567810151acf13f16

Spk10_imposter, 212MB, MD5:42abd80e27b78983a13b74e44a67be65

Spk251_train, 10GB, MD5:02bee7caf460072a6fc22e3666ac2187

Spk251_test, 1GB, MD5:182dd6b17f8bcfed7a998e1597828ed6
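
After a manual download, the archive can be checked against the MD5 values listed above. The following is a minimal sketch using only the Python standard library; the archive file name in the usage note is a hypothetical placeholder, since the actual download names are not given here.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a 10GB archive never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path: str, expected: str) -> bool:
    """Compare a file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected.strip().lower()
```

For example, `verify_md5("Spk10_enroll.tar.gz", "0e90fb00b69989c0dde252a585cead85")` would return True for an intact copy of Spk10_enroll, assuming that is the name the archive was saved under.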

Model Preparation

Speaker Enroll (CSI-E/SV/OSI tasks)

  • Download iv_system and xv_system, and untar them inside the model directory. These contain the pre-trained ivector-PLDA and xvector-PLDA background models.
  • Run enroll_iv.py and enroll_xv.py to enroll the speakers in Spk10_enroll. The information about the enrolled speakers will be stored in speakers/.

Natural Training (CSI-NE)

  • python defense/natural_train.py -num_epoches 30 -batch_size 128 -model_ckpt 'path to store model' -log 'training log path'
  • See defense/natural_train.py for more arguments and details.

Adversarial Training (CSI-NE)

  • Sole FGSM adversarial training:

    python defense/adver_train.py -attacker FGSM -epsilon 0.002

  • Sole PGD adversarial training:

    python defense/adver_train.py -attacker PGD -epsilon 0.002 -max_iter 10

  • Combining adversarial training with the input transformation AT (AT is randomized, so EOT should be used during training):

    python defense/adver_train.py -attacker PGD -epsilon 0.002 -max_iter 10 -defense AT -defense_param 16 -EOT_size 10 -EOT_batch_size 5

  • See defense/adver_train.py for more arguments and details.
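
The -attacker, -epsilon and -max_iter options above refer to FGSM and PGD as they are commonly defined. The sketch below illustrates the two perturbation rules on plain Python lists. It is not the code in defense/adver_train.py: `grad_fn` (a function returning the loss gradient with respect to the input) and `step_size` are hypothetical names introduced for illustration.

```python
from typing import Callable, List

Vector = List[float]


def sign(value: float) -> int:
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fgsm_step(x: Vector, grad_fn: Callable[[Vector], Vector], epsilon: float) -> Vector:
    """FGSM: one step of size epsilon along the sign of the gradient."""
    grad = grad_fn(x)
    return [xi + epsilon * sign(gi) for xi, gi in zip(x, grad)]


def pgd(x: Vector, grad_fn: Callable[[Vector], Vector], epsilon: float,
        max_iter: int, step_size: float) -> Vector:
    """PGD: repeated small sign-gradient steps, each projected back
    into the L-infinity ball of radius epsilon around the original x."""
    x_adv = list(x)
    for _ in range(max_iter):
        grad = grad_fn(x_adv)
        x_adv = [xa + step_size * sign(gi) for xa, gi in zip(x_adv, grad)]
        # Projection: clamp every sample to [x - epsilon, x + epsilon].
        x_adv = [min(max(xa, xo - epsilon), xo + epsilon)
                 for xa, xo in zip(x_adv, x)]
    return x_adv
```

In adversarial training, examples crafted this way from each batch are fed to the model in place of (or alongside) the benign inputs, so epsilon bounds how far any training example may move from the original audio.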

Generate Adversarial Examples

  • python attackMain.py -model_type AudioNet -model_file 'pre-trained model path' -attacker FAKEBOB -epsilon 0.002 -task CSI -root 'root of benign dataset' -name 'name of benign dataset'
  • python attackMain.py -model_type ivector -model_file 'path of speaker_model file generated by enroll_iv.py' -attacker FAKEBOB -epsilon 0.002 -task CSI -root 'root of benign dataset' -name 'name of benign dataset'
  • See attackMain.py for more details.

Test the effectiveness of Adversarial Examples

  • python test_attack.py -model_type AudioNet -model_file 'pre-trained model path' -root 'root of adver dataset' -name 'name of adver dataset'
  • See test_attack.py for more details.

Extension

MC

MC contains three state-of-the-art embedding-based speaker recognition models: ivector-PLDA, xvector-PLDA, and AudioNet. Xvector-PLDA and AudioNet are based on neural networks, while ivector-PLDA is based on a statistical model (i.e., a Gaussian Mixture Model).

The flexibility and extensibility of SEC4SR make it easy to add new models. Just wrap the model as a torch.nn.Module and implement the abstract method make_decision. See model/Model.py for details.
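
As a rough illustration of this pattern, the sketch below wraps a toy scorer as a torch.nn.Module and adds a make_decision method. The class name, the constructor arguments and the assumption that make_decision returns a (decisions, scores) pair are hypothetical; model/Model.py defines the actual interface.

```python
import torch
import torch.nn as nn


class ToySpeakerModel(nn.Module):
    """Hypothetical closed-set model; a linear layer stands in for a real scorer."""

    def __init__(self, num_features: int, num_speakers: int):
        super().__init__()
        self.scorer = nn.Linear(num_features, num_speakers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_features) -> scores: (batch, num_speakers)
        return self.scorer(x)

    def make_decision(self, x: torch.Tensor):
        # Assumed contract: return the predicted speaker index per input
        # together with the raw scores the decision was based on.
        scores = self.forward(x)
        decisions = scores.argmax(dim=1)
        return decisions, scores
```

A closed-set (CSI) decision is simply the highest-scoring speaker, as here; a model for SV or OSI would additionally compare the score against a threshold before accepting.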

DAC

To add a new dataset, one just needs to define a class inheriting from torch.utils.data.Dataset, as in dataset/Dataset.py.
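
A minimal sketch of such a class follows. It generates random waveforms instead of reading audio files, and the (audio, label) item layout is an assumption made for illustration; dataset/Dataset.py shows the layout the rest of the code expects.

```python
import torch
from torch.utils.data import Dataset


class ToyAudioDataset(Dataset):
    """Hypothetical dataset of random waveforms with speaker labels."""

    def __init__(self, num_utterances: int = 8, num_samples: int = 16000,
                 num_speakers: int = 4):
        generator = torch.Generator().manual_seed(0)
        # Waveforms in [-1, 1) with shape (utterance, channel, sample).
        self.audio = torch.rand(num_utterances, 1, num_samples,
                                generator=generator) * 2 - 1
        self.labels = torch.arange(num_utterances) % num_speakers

    def __len__(self) -> int:
        return self.audio.shape[0]

    def __getitem__(self, index: int):
        return self.audio[index], self.labels[index]
```

Because the class implements `__len__` and `__getitem__`, it can be passed directly to torch.utils.data.DataLoader for batching.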

AC

To incorporate a new attack algorithm, one just needs to inherit from the class in attack/Attack.py and implement the abstract method attack. See attack/Attack.py for details.
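
The subclassing pattern can be sketched as follows. The `Attack` base class below is only a stand-in so that the example runs on its own; the real base class and the exact signature of attack are in attack/Attack.py and may differ. `RandomNoiseAttack` is a hypothetical baseline, not one of the attacks shipped with SEC4SR.

```python
import random
from abc import ABC, abstractmethod
from typing import List


class Attack(ABC):
    """Stand-in for the base class in attack/Attack.py (assumed interface)."""

    @abstractmethod
    def attack(self, x: List[float], y: int) -> List[float]:
        """Return an adversarial version of waveform x with label y."""


class RandomNoiseAttack(Attack):
    """Hypothetical baseline: uniform noise bounded by epsilon."""

    def __init__(self, epsilon: float = 0.002, seed=None):
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def attack(self, x: List[float], y: int) -> List[float]:
        noisy = [xi + self.rng.uniform(-self.epsilon, self.epsilon) for xi in x]
        # Keep the result a valid waveform in [-1, 1].
        return [min(max(v, -1.0), 1.0) for v in noisy]
```

Declaring attack as abstract means a subclass that forgets to implement it cannot be instantiated, which surfaces the mistake early.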

DEC

All input transformation methods are implemented as standalone Python functions, making it easy to extend these methods.
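
To illustrate what a standalone transformation function can look like, here is a sketch of amplitude quantization on a plain list of samples. The function name, its parameters and the 16-bit scaling are assumptions chosen for illustration; the transformations in the repository have their own signatures.

```python
from typing import List


def quantize(audio: List[float], q: int = 512, bits: int = 16) -> List[float]:
    """Round each sample to the nearest multiple of q on an integer scale.

    audio is assumed to hold floats in [-1, 1]; samples are scaled to the
    signed integer range for the given bit depth, rounded, and scaled back.
    """
    scale = 2 ** (bits - 1)
    return [round(sample * scale / q) * q / scale for sample in audio]
```

For example, `quantize([0.01])` gives `[0.015625]`, since 0.01 maps to 327.68 on the 16-bit scale and the nearest multiple of 512 is 512. A function of this shape takes a waveform and returns a waveform, so several transformations can be chained by passing the output of one as the input of the next.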

ADAC

All these techniques are implemented as standalone wrappers so that they can be easily plugged into attacks to mount adaptive attacks.
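
Taking EOT (used above with the randomized AT transformation) as an example, the wrapper idea can be sketched like this: a gradient function that is noisy because of a randomized defense is wrapped so that each call returns the average over several evaluations. The names `eot_wrapper`, `grad_fn` and `eot_size` are hypothetical and do not describe the repository's API.

```python
from typing import Callable, List

Vector = List[float]


def eot_wrapper(grad_fn: Callable[[Vector], Vector],
                eot_size: int) -> Callable[[Vector], Vector]:
    """Wrap grad_fn so each call averages eot_size gradient evaluations."""

    def averaged_grad(x: Vector) -> Vector:
        total = [0.0] * len(x)
        for _ in range(eot_size):
            # Each call may see a different random transformation.
            grad = grad_fn(x)
            total = [t + g for t, g in zip(total, grad)]
        return [t / eot_size for t in total]

    return averaged_grad
```

Because the wrapped function has the same signature as the original, an attack that accepts a gradient function can use either one without modification.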
