AIR-ASVspoof

GitHub | IEEE Xplore | arXiv

This repository contains the official implementation of our SPL paper, "One-class Learning Towards Synthetic Voice Spoofing Detection."

[poster] [slides] [video] [Project webpage]

Updates

[Jun. 2023] We further improved the loss function by proposing the SAMO algorithm (Speaker Attractor Multi-Center One-Class Learning) @ ICASSP 2023 (Ding et al. 2023). GitHub

[Feb. 2023] We investigated one-class learning further and added new loss functions. Check out the book chapter published in the Handbook of Biometric Anti-Spoofing (Zhang et al. 2023). GitHub

[Sep. 2021] This version of the code used LFCC+ResNet as the backbone. The LFCC feature was implemented with MATLAB, and ResNet was implemented with PyTorch. If you would like full Python code, please check out our follow-up work @ Interspeech 2021 (Zhang et al. 2021). GitHub

Requirements

python==3.6

pytorch==1.1.0

Data Preparation

The LFCC features are extracted with the MATLAB implementation provided by the ASVspoof 2019 organizers. First run process_LA_data.m in MATLAB, then run python3 reload_data.py. Make sure to change the directory paths in both scripts to the paths on your machine.

Run the training code

Before running train.py, change path_to_database, path_to_features, and path_to_protocol to match the file locations on your machine.
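As a rough illustration of how these three paths could be exposed as command-line options, here is a minimal argparse sketch. Only --add_loss, -o, and --gpu appear in the commands on this page; the long option name --out_fold, the spelling of the three path flags, and every default value are assumptions made for the example, not the actual contents of train.py.

```python
import argparse


def build_parser():
    """Hypothetical option parser; not copied from train.py."""
    parser = argparse.ArgumentParser(description="Training options (illustrative sketch)")
    # Flags that appear in the example command on this page.
    parser.add_argument("--add_loss", default="ocsoftmax")
    parser.add_argument("-o", "--out_fold", default="./models/ocsoftmax")  # long name assumed
    parser.add_argument("--gpu", default="0")
    # Path settings named in the README; flag spellings and defaults are assumptions.
    parser.add_argument("--path_to_database", default="/path/to/ASVspoof2019/LA/")
    parser.add_argument("--path_to_features", default="/path/to/ASVspoof2019/LA/Features/")
    parser.add_argument("--path_to_protocol", default="/path/to/protocols/")
    return parser


args = build_parser().parse_args(
    ["--add_loss", "ocsoftmax", "-o", "./models/ocsoftmax", "--gpu", "0"]
)
print(args.add_loss, args.out_fold, args.gpu)
```

With options like these, the paths could be overridden per machine on the command line instead of by editing the script.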

python3 train.py --add_loss ocsoftmax -o ./models/ocsoftmax --gpu 0

Run the test code with trained model

You can change the model_dir to the location of the model you would like to test with.

python3 test.py -m ./models/ocsoftmax -l ocsoftmax --gpu 0

Citation

@ARTICLE{zhang2021one,
  author={Zhang, You and Jiang, Fei and Duan, Zhiyao},
  journal={IEEE Signal Processing Letters}, 
  title={One-Class Learning Towards Synthetic Voice Spoofing Detection}, 
  year={2021},
  volume={28},
  number={},
  pages={937-941},
  abstract={Human voices can be used to authenticate the identity of the speaker, but the automatic speaker verification (ASV) systems are vulnerable to voice spoofing attacks, such as impersonation, replay, text-to-speech, and voice conversion. Recently, researchers developed anti-spoofing techniques to improve the reliability of ASV systems against spoofing attacks. However, most methods encounter difficulties in detecting unknown attacks in practical use, which often have different statistical distributions from known attacks. Especially, the fast development of synthetic voice spoofing algorithms is generating increasingly powerful attacks, putting the ASV systems at risk of unseen attacks. In this work, we propose an anti-spoofing system to detect unknown synthetic voice spoofing attacks (i.e., text-to-speech or voice conversion) using one-class learning. The key idea is to compact the bona fide speech representation and inject an angular margin to separate the spoofing attacks in the embedding space. Without resorting to any data augmentation methods, our proposed system achieves an equal error rate (EER) of 2.19% on the evaluation set of ASVspoof 2019 Challenge logical access scenario, outperforming all existing single systems (i.e., those without model ensemble).},
  keywords={},
  doi={10.1109/LSP.2021.3076358},
  ISSN={1558-2361},
  month={},}

Follow-up works

Please check out our follow-up work:

[1] Zhang, Y., Zhu, G., Jiang, F., Duan, Z. (2021) An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems. Proc. Interspeech 2021, 4309-4313, doi: 10.21437/Interspeech.2021-1820 [link] [arXiv] [code] [video]

[2] Chen, X., Zhang, Y., Zhu, G., Duan, Z. (2021) UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021. Proc. 2021 Edition of the Automatic Speaker Verification and Spoofing Countermeasures Challenge, 75-82, doi: 10.21437/ASVSPOOF.2021-12 [link] [arXiv] [code] [video]

[3] Zhang, Y., Jiang, F., Zhu, G., Chen, X., & Duan, Z. (2023). Generalizing Voice Presentation Attack Detection to Unseen Synthetic Attacks and Channel Variation. In Handbook of Biometric Anti-Spoofing: Presentation Attack Detection and Vulnerability Assessment (pp. 421-443). [link] [code]

[4] Ding, S., Zhang, Y., & Duan, Z. (2023). Samo: Speaker attractor multi-center one-class learning for voice anti-spoofing. In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). [link] [arXiv] [code] [video]

air-asvspoof's People

Contributors

fjiang9, suchitreddi, yzyouzhang

air-asvspoof's Issues

how to get cm_scores for eval set when labels are not present

hey,

Thanks for the repository!
I'm trying to test the model with the ocsoftmax loss on a test set that does not contain labels. To my surprise, test.py also uses labels, which seems to make it impossible to calculate CM scores without them, at least when using the ocsoftmax or amsoftmax loss. Am I missing something? Please be patient and gentle; I have just started working on this topic.

Hope you reply soon.
Thanks

Here is a snapshot of test.py for your reference: [Screenshot from 2021-06-02 20-06-37]
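In the paper's formulation, the one-class score of an utterance is the cosine similarity between its embedding and a learned bona fide direction, and labels only enter through the margin term of the loss. Under that reading, a score can in principle be computed without labels. The sketch below illustrates the idea in plain Python; the names cosine_score, center, and utterances are hypothetical, and this is not the code in test.py.

```python
import math


def cosine_score(embedding, center):
    """Cosine similarity between an utterance embedding and a center vector."""
    dot = sum(e * c for e, c in zip(embedding, center))
    norm_e = math.sqrt(sum(e * e for e in embedding))
    norm_c = math.sqrt(sum(c * c for c in center))
    return dot / (norm_e * norm_c)


# Toy 2-D values, chosen only for illustration.
center = [1.0, 0.0]
utterances = {"utt_a": [2.0, 0.1], "utt_b": [-1.0, 0.5]}

# No label is needed to produce a score per utterance.
scores = {name: cosine_score(vec, center) for name, vec in utterances.items()}
for name, score in scores.items():
    print(name, round(score, 3))
```

If the scoring function in use requires a label argument, passing placeholder labels and ignoring the returned loss is one possible workaround, provided the returned scores do not depend on the labels.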

Training with 750 frames?

Hello, have you tried training with other frame lengths, such as 300 or 500 frames? How did they perform?

Error while running process_LA_data.m

Hi,

Thank you for the repository.
I am trying to run process_LA_data.m. I am very new to MATLAB/Octave,
so I am unable to find a solution for the following problem:
[image]

I guess this is a silly issue, but I'm unable to find a solution for it.

Score greater than 1

Why do some bona fide samples get an inference score greater than 1? Shouldn't scores normally lie between -1 and 1?

MATLAB process_LA_data error

I tried running the LA data script, but it gives this error:
Error using textscan
Invalid file identifier. Use fopen to generate a valid file identifier.

Error in process_LA_data (line 23)
trainprotocol = textscan(trainfileID, '%s%s%s%s%s');

I am new at matlab and this seems like a stupid error. Kindly help me on this.
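MATLAB's fopen returns -1 when a file cannot be opened, and passing that value to textscan produces the "Invalid file identifier" message, so the protocol path is a natural first thing to check. The same check can be sketched in Python, assuming the five whitespace-separated columns implied by the '%s%s%s%s%s' format above. The function name read_protocol and the two demo rows are made up for the example.

```python
import os
import tempfile


def read_protocol(path):
    """Read a five-column, whitespace-separated protocol file into a list of rows."""
    if not os.path.isfile(path):
        # This is the situation in which MATLAB's fopen would return -1.
        raise FileNotFoundError(f"Protocol file not found: {path}")
    rows = []
    with open(path, "r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if len(fields) != 5:
                raise ValueError(f"Line {line_number}: expected 5 fields, got {len(fields)}")
            rows.append(fields)
    return rows


# Demo on a throwaway file; both rows are placeholders, not real protocol entries.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.protocol.txt")
with open(demo_path, "w", encoding="utf-8") as handle:
    handle.write("SPK_0001 UTT_0000001 - - bonafide\n")
    handle.write("SPK_0002 UTT_0000002 - A01 spoof\n")

rows = read_protocol(demo_path)
print(len(rows), rows[0][-1])
```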

from torch._C import * Import Error

Terminal Input:
python train.py --add_loss ocsoftmax -o ./models1028/ocsoftmax/train --gpu 1

Terminal Output:
Traceback (most recent call last):
  File "train.py", line 5, in <module>
    from resnet import setup_seed, ResNet
  File "D:\Programming\Python\Python\AIR-ASVspoof\resnet.py", line 1, in <module>
    import torch
  File "D:\Programming\Python\Python\venv\lib\site-packages\torch\__init__.py", line 79, in <module>
    from torch._C import *
ImportError: DLL load failed: The specified procedure could not be found.

Comments: The same repeats even for --gpu 0. Tried reinstalling torch version 1.1.0 with and without GPU.

For additional information; the same code run on python terminal gave the following output:

D:\Programming\Python\Python\venv\Scripts\python.exe C:/Users/Suchit/AppData/Roaming/JetBrains/IntelliJIdea2022.2/plugins/python/helpers/pydev/pydevconsole.py --mode=client --host=127.0.0.1 --port=62758
import sys; print('Python %s on %s' % (sys.version, sys.platform))
sys.path.extend(['D:\Programming\Python\Python'])
PyDev console: starting.
Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)] on win32
runfile('D:\Programming\Python\Python\AIR-ASVspoof-Suchit\train.py', wdir='D:\Programming\Python\Python\AIR-ASVspoof-Suchit')
Traceback (most recent call last):
File "C:\Users\Suchit\AppData\Local\Programs\Python\Python36\lib\code.py", line 91, in runcode
exec(code, self.locals)
File "", line 1, in
File "C:\Users\Suchit\AppData\Roaming\JetBrains\IntelliJIdea2022.2\plugins\python\helpers\pydev_pydev_bundle\pydev_umd.py", line 198, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "C:\Users\Suchit\AppData\Roaming\JetBrains\IntelliJIdea2022.2\plugins\python\helpers\pydev_pydev_imps_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "D:\Programming\Python\Python\AIR-ASVspoof-Suchit\train.py", line 5, in
from resnet import setup_seed, ResNet
File "C:\Users\Suchit\AppData\Roaming\JetBrains\IntelliJIdea2022.2\plugins\python\helpers\pydev_pydev_bundle\pydev_import_hook.py", line 21, in do_import
module = self._system_import(name, *args, **kwargs)
File "D:\Programming\Python\Python\AIR-ASVspoof-Suchit\resnet.py", line 1, in
import torch
File "C:\Users\Suchit\AppData\Roaming\JetBrains\IntelliJIdea2022.2\plugins\python\helpers\pydev_pydev_bundle\pydev_import_hook.py", line 21, in do_import
module = self.system_import(name, *args, **kwargs)
File "D:\Programming\Python\Python\venv\lib\site-packages\torch\__init__.py", line 79, in
from torch._C import *
File "C:\Users\Suchit\AppData\Roaming\JetBrains\IntelliJIdea2022.2\plugins\python\helpers\pydev_pydev_bundle\pydev_import_hook.py", line 21, in do_import
module = self._system_import(name, *args, **kwargs)
ImportError: DLL load failed: The specified procedure could not be found.

Please look into this @yzyouzhang.
Thank you.

Minor code errors in process_LA_data.m

I don't have much experience with Matlab, so there may be a few mistakes. I apologize for this in advance.

Desired folder not formed
Line 14: pathToFeatures = horzcat('/home/yzh298/anti-spoofing/ASVspoof2019', access_type, 'Features/');
Add a / after ASVspoof2019 and another before Features so that they become separate folders; as written, the code produces a single folder named 'ASVspoof2019LAFeatures'.
Fix: After adding the slashes, the code becomes:
pathToFeatures = horzcat('/home/yzh298/anti-spoofing/ASVspoof2019/', access_type, '/Features/');
and the resulting path will be ASVspoof2019/LA/Features.
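The difference between plain concatenation and component-wise joining can be reproduced in a few lines of Python. This is only an illustration of the path logic, using the directory from the issue as an example value; posixpath is used so that the output is the same on every platform.

```python
import posixpath

access_type = "LA"
root = "/home/yzh298/anti-spoofing/ASVspoof2019"

# Plain concatenation, as in the original horzcat call: no separators are added.
concatenated = root + access_type + "Features/"

# Joining components inserts the separators automatically.
joined = posixpath.join(root, access_type, "Features") + "/"

print(concatenated)  # /home/yzh298/anti-spoofing/ASVspoof2019LAFeatures/
print(joined)        # /home/yzh298/anti-spoofing/ASVspoof2019/LA/Features/
```

In MATLAB, fullfile plays the same role as the join here, and the script already uses it for the protocol path.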

Incorrect file name
Line 17: trainProtocolFile = fullfile(pathToDatabase, horzcat('ASVspoof2019_', access_type, '_cm_protocols'), horzcat('ASVspoof2019.', access_type, '.cm.train.trl.txt'));
Here, the last part of the path is '.cm.train.trl.txt'. In the original dataset folder 'DS_10283_3336', the file at this location is named ASVspoof2019.LA.cm.train.trn.txt, so the code fails with a 'file not found' error.
Fix: Replace trl with trn in the file name.

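Switching from a single GPU to multiple GPUs

When changing the code to run on multiple GPUs, does the ocsoftmax loss also need to be wrapped in DataParallel (DP)?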

OCLoss and final Linear Layer

Thanks for the good work.
After reading the paper and your implementation, I have a question: since this loss does not use the output of the final linear layer (which calculates the classification score), how can that layer learn its optimal parameters?
I hope you can clarify this point.
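For context, the OC-Softmax loss as formulated in the paper depends only on the cosine score between the normalized embedding and a learned weight vector, together with two margins and a scale factor. The plain-Python sketch below follows that formula; the function name and the default values for m_real, m_fake, and alpha are illustrative assumptions, and this is not the repository's loss.py.

```python
import math


def oc_softmax_loss(scores, labels, m_real=0.9, m_fake=0.2, alpha=20.0):
    """Mean OC-Softmax loss from cosine scores and labels (0 = bona fide, 1 = spoof)."""
    total = 0.0
    for score, label in zip(scores, labels):
        if label == 0:
            margin_term = m_real - score  # penalize bona fide scores below m_real
        else:
            margin_term = score - m_fake  # penalize spoof scores above m_fake
        z = alpha * margin_term
        # Numerically stable softplus: log(1 + exp(z)).
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return total / len(scores)


# Well-separated scores give a small loss; swapped scores give a large one.
good = oc_softmax_loss([1.0, -1.0], [0, 1])
bad = oc_softmax_loss([-1.0, 1.0], [0, 1])
print(round(good, 4), round(bad, 4))
```

Nothing in this computation involves a classification layer, which is the observation behind the question above.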

_pickle.UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified.

Hi. I am running into the following error. I am wondering if you can help me.

Traceback (most recent call last):
  File "[path]/AIR-ASVspoof/dataset.py", line 53, in __getitem__
    feat_mat = pickle.load(feature_handle)
_pickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "[path]/AIR-ASVspoof/train.py", line 265, in <module>
    _, _ = train(args)
  File "[path]/Mikul/AIR-ASVspoof/train.py", line 126, in train
    feat, _, _, _ = training_set[29]
  File "[path]/AIR-ASVspoof/dataset.py", line 61, in __getitem__
    feat_mat = pickle.load(feature_handle)
_pickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.

Thanks
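This message is what the standard pickle module raises when a byte stream contains persistent-ID records and the reader has no persistent_load hook. One possible cause, not confirmed for this issue, is that the feature files were written by a serializer that uses persistent IDs (torch.save is one such writer, as far as I know) and are then read back with plain pickle.load. The standard-library sketch below reproduces the mechanism; TaggingPickler and TaggingUnpickler are hypothetical names.

```python
import io
import pickle


class TaggingPickler(pickle.Pickler):
    """Pickler that stores bytes objects as external references (persistent IDs)."""

    def persistent_id(self, obj):
        if isinstance(obj, bytes):
            return ("external-bytes", obj.hex())
        return None  # everything else is pickled normally


class TaggingUnpickler(pickle.Unpickler):
    """Unpickler that knows how to resolve the references written above."""

    def persistent_load(self, pid):
        tag, payload = pid
        if tag != "external-bytes":
            raise pickle.UnpicklingError(f"unknown persistent id: {pid!r}")
        return bytes.fromhex(payload)


buffer = io.BytesIO()
TaggingPickler(buffer).dump({"feat": b"\x01\x02"})

# A plain loader has no persistent_load hook, so it fails on this stream.
try:
    pickle.loads(buffer.getvalue())
    plain_load_error = None
except pickle.UnpicklingError as err:
    plain_load_error = str(err)

# The matching unpickler restores the original object.
restored = TaggingUnpickler(io.BytesIO(buffer.getvalue())).load()
print(plain_load_error)
print(restored)
```

If that is what happened here, reading the files with the loader that matches the writer, or regenerating them with reload_data.py, would be the things to try.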

how to do Feature embedding visualization?

Hi, author! I noticed the feature embedding visualization in Figure 2 of the paper. Could you share the plotting details, or provide some of the plotting source code?

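Hello!

Hello! I see that the numbers of bona fide and spoofed utterances in the ASVspoof2019 dataset are imbalanced. Have you considered that the model's predictions might be biased toward spoof? Is the proposed one-class loss designed to address this imbalance, or does it also work well on a balanced dataset? Also, I noticed that in ASVspoof2019.LA.cm.train.trl.txt the bona fide samples come first, followed by the spoofed samples, and that the data is not shuffled during training. Is that intentional as well?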

continue_training

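Hello, I see that continue_training only loads lfcc_model and does not load the loss model, so oc-softmax is trained from scratch again. My understanding is that the oc-softmax model also needs to be loaded; is that right?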

Error while applying your model

I applied your pretrained model to some audio files, but it gives me this error:

RuntimeError: Calculated padded input size per channel: (3 x 752). Kernel size: (9 x 3). Kernel size can't be greater than actual input size

Do you know what the reason is?

Dear

I have a question. How can I differentiate between real and fake samples using the score from loss.py (OCSoftmax)? A spoof sample has a high score near 1 and a bona fide sample has a low score near -1. What is the threshold to distinguish them? Is it 0? As I understand it, the output of a binary classification model is a probability, such as [0.1, 0.9], where 0.1 is the bona fide probability and 0.9 is the spoof probability. How can I get this kind of output from your trained model?
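A fixed threshold of 0 is not guaranteed to be a good operating point. A common practice in spoofing detection is to pick the threshold on a labeled development set, for example at the equal error rate (EER) point. The sketch below is a simple, unoptimized way to do that in plain Python. It assumes that higher scores mean "more bona fide"; with the opposite convention described in the question, negate the scores first. The function name and the toy scores are made up for the example.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold), treating higher scores as more bona fide-like."""
    candidates = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for threshold in candidates:
        # False rejection: bona fide trials scored below the threshold.
        frr = sum(s < threshold for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoof trials scored at or above the threshold.
        far = sum(s >= threshold for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Toy development-set scores, for illustration only.
bonafide = [0.9, 0.8, 0.7, 0.2]
spoof = [-0.9, -0.5, 0.3, -0.7]
eer, threshold = compute_eer(bonafide, spoof)
print(eer, threshold)  # 0.25 0.3
```

An utterance is then accepted as bona fide when its score is at or above the chosen threshold. These scores are similarities rather than probabilities, so they do not need to sum to one.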

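Inconsistent dimensions during model training

Hello, how do you resize the audio during training? When I run train.py from the code downloaded from the repository, I get: RuntimeError: stack expects each tensor to be equal size, but got [227, 750] at entry 0 and [301, 750] at entry 1. Have you run into this problem? How did you solve it?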
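Batching requires every feature matrix in a batch to have the same shape. One way to get there is to truncate or repeat-pad each matrix to a fixed number of frames. The sketch below uses 750 frames, the number mentioned elsewhere on this page, and assumes a (feat_dim, n_frames) layout with repeat padding; neither assumption is confirmed here, and the function name fix_length is hypothetical.

```python
def fix_length(feat, target_frames=750):
    """Truncate or repeat-pad a (feat_dim x n_frames) matrix to target_frames columns."""
    fixed = []
    for row in feat:
        if not row:
            raise ValueError("feature row is empty")
        if len(row) >= target_frames:
            fixed.append(row[:target_frames])
        else:
            repeats = -(-target_frames // len(row))  # ceiling division
            fixed.append((row * repeats)[:target_frames])
    return fixed


# Small target length so the result is easy to inspect.
short = [[1, 2, 3], [4, 5, 6]]
long_feat = [list(range(10))]
padded = fix_length(short, target_frames=7)
truncated = fix_length(long_feat, target_frames=4)
print(padded)     # [[1, 2, 3, 1, 2, 3, 1], [4, 5, 6, 4, 5, 6, 4]]
print(truncated)  # [[0, 1, 2, 3]]
```

If the reported shapes differ in the first dimension, as in the error above, it may also be worth checking whether the stored features are transposed relative to what the dataset code expects.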

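Unstable training results

Hello, I downloaded the code and trained the model, and found that two runs with identical parameters give quite different results.
With the ocsoftmax loss, the best case reaches an EER of 2% on eval, while the worst case is above 3% (the other loss functions show the same behavior).
I have considered the effect of random seeds, but once training has converged the results should normally not differ this much. Is such a large gap abnormal? Have you observed this as well?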

Pre-trained models

Hi,

Hope all is well and thank you for a well-written paper!

Are you planning to open-source your pre-trained models as well?

Thanks,
Johannes

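Question about scores when using oc-softmax

Why is it that with oc-softmax the positive samples score around -1 and the negative samples around 1, whereas with softmax the positive samples score around 1 and the negative samples around 0?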
