rafaelvalle / asrgen

Attacking Speaker Recognition with Deep Generative Models

Home Page: https://arxiv.org/pdf/1801.02384.pdf

Python 4.85% Jupyter Notebook 95.15%

adversarial-attacks asr gans text-to-speech

asrgen's Issues

Generate target samples

Thank you for your contribution. I have some questions about the experiments and hope you can answer them.
First question:
In gan_synthesis.ipynb:

audio = load_wav_to_torch('data_16khz/zcathy/cathy.wav', SAMPLING_RATE)
audio /= MAX_WAV_VALUE
audio = audio[None, :]
reference_mel = taco_stft.mel_spectrogram(audio)[0]
print(reference_mel.min(), reference_mel.max())

mel -= mel.min()
mel = mel / mel.max()
mel = mel * reference_mel.max()
print(mel.min(), mel.max())

Is mel = mel * reference_mel.max() the step that matches the generated (fake) audio to the real audio?
I don't quite understand how to use the trained G_NET to generate audio whose voiceprint matches the target speaker.
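
For readers following the first question: taken on their own, the three rescaling lines quoted above only change the value range of mel. They shift it to start at 0, scale it to [0, 1], and then stretch it to [0, reference_mel.max()]; they do not compare the content of the two spectrograms. A minimal sketch of that arithmetic, using plain Python lists instead of torch tensors so it runs without dependencies (the function name and the sample values are hypothetical, not taken from the notebook):

```python
def rescale_to_reference(mel, reference_max):
    """Min-max normalise `mel` to [0, 1], then stretch it to [0, reference_max]."""
    lowest = min(mel)
    shifted = [v - lowest for v in mel]        # mel -= mel.min()
    peak = max(shifted)
    if peak == 0:
        # Constant input: every value maps to 0, avoiding division by zero.
        return [0.0 for _ in shifted]
    unit = [v / peak for v in shifted]         # mel = mel / mel.max()
    return [v * reference_max for v in unit]   # mel = mel * reference_mel.max()


# Hypothetical flattened generator output and reference maximum.
generated = [-1.0, 0.0, 1.0, 3.0]
rescaled = rescale_to_reference(generated, reference_max=2.0)
print(min(rescaled), max(rescaled))
```

After this step the generated spectrogram spans the same range as the reference, which is what the second print in the quoted snippet checks.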

Second question:
Is gan_attack.ipynb a targeted attack?
The target ID is set to 0. Can it be changed to another ID?

Looking forward to your reply!

TypeError: Cannot handle this data type

Running python gan_train.py produces the following error:

File "", line 1, in
runfile('D:/Documents/paper/asrgen-master/asrgen-master/gan_train.py', wdir='D:/Documents/paper/asrgen-master/asrgen-master')

File "D:\Program Files\Anconda3\envs\tensorflow\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
execfile(filename, namespace)

File "D:\Program Files\Anconda3\envs\tensorflow\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "D:/Documents/paper/asrgen-master/asrgen-master/gan_train.py", line 140, in
logger.log_validation(real_data_spk, real_data_spk+reg_noise,fake_data, iteration)

File "D:\Documents\paper\asrgen-master\asrgen-master\logger.py", line 33, in log_validation
iteration)

File "D:\Program Files\Anconda3\envs\tensorflow\lib\site-packages\tensorboardX\writer.py", line 548, in add_image
image(tag, img_tensor, dataformats=dataformats), global_step, walltime)

File "D:\Program Files\Anconda3\envs\tensorflow\lib\site-packages\tensorboardX\summary.py", line 216, in image
image = make_image(tensor, rescale=rescale)

File "D:\Program Files\Anconda3\envs\tensorflow\lib\site-packages\tensorboardX\summary.py", line 256, in make_image
image = Image.fromarray(tensor)

File "D:\Program Files\Anconda3\envs\tensorflow\lib\site-packages\PIL\Image.py", line 2492, in fromarray
raise TypeError("Cannot handle this data type")

TypeError: Cannot handle this data type

How can I fix it?
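
One plausible cause, not confirmed in this thread, is an axis-order mismatch: the traceback shows add_image forwarding a dataformats argument, and if that argument says channels come first ('CHW') while the logger passes a plot laid out as height x width x channels, the reordered array no longer has 3 or 4 channels in its last axis and Image.fromarray rejects it. The sketch below only reproduces that shape bookkeeping in plain Python; the helper names and the example shape are hypothetical, and whether logger.py really passes an HWC array is an assumption to verify.

```python
def as_hwc(shape, dataformats="CHW"):
    """Return the shape after reordering the axes to (height, width, channels)."""
    if len(shape) != len(dataformats):
        raise ValueError("shape and dataformats must have the same length")
    position = {axis: i for i, axis in enumerate(dataformats)}
    return tuple(shape[position[axis]] for axis in "HWC")


def looks_like_rgb(hwc_shape):
    """An RGB or RGBA image has 3 or 4 values in its last axis."""
    return hwc_shape[-1] in (3, 4)


# Hypothetical plot rendered as height x width x RGB.
plot_shape = (300, 1200, 3)

wrong = as_hwc(plot_shape, "CHW")   # layout misdeclared as channels-first
right = as_hwc(plot_shape, "HWC")   # layout declared as it really is

print(wrong, looks_like_rgb(wrong))
print(right, looks_like_rgb(right))
```

If this is the cause, passing dataformats='HWC' in the add_image call inside logger.py is the change to try.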

About the boundary between real_data_spk and real_data_nspk

Sorry, I can't understand why line 82 (or line 94) of gan_train.py uses BATCH_SIZE instead of SAMPLE_SIZE. As I see it, because line 62 of gan_train.py uses SAMPLE_SIZE, train_generator actually yields 2*SAMPLE_SIZE samples, so the boundary on line 82 should be SAMPLE_SIZE. Where am I going wrong? I'm not very experienced with this. Thank you sincerely.
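
The question comes down to where the target-speaker half of a batch ends. A minimal sketch, assuming (this is not checked against gan_train.py) that each batch holds the target-speaker samples first, followed by an equal number of samples from other speakers; the function and variable names other than real_data_spk and real_data_nspk are hypothetical:

```python
def split_batch(batch, n_target):
    """Split a batch whose first `n_target` items belong to the target speaker."""
    if not 0 <= n_target <= len(batch):
        raise ValueError("n_target must lie between 0 and len(batch)")
    return batch[:n_target], batch[n_target:]


# Hypothetical number of samples drawn per class for one batch.
n_per_class = 4
batch = ["spk"] * n_per_class + ["nspk"] * n_per_class

real_data_spk, real_data_nspk = split_batch(batch, n_per_class)
print(len(real_data_spk), len(real_data_nspk))
```

Under that assumption the boundary must equal the number of target-speaker samples per batch, so using BATCH_SIZE there is only consistent if the generator yields 2*BATCH_SIZE items per batch.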

About the dataset

Which datasets should be used to train the speaker recognition system? The repository has a folder named 'data_16khz', which I suspect contains 100 speakers from 2004 NIST and 1 speaker from 2013 Blizzard. Is that enough? Or am I perhaps missing part of each speaker's audio, which would explain why the accuracy of my speaker recognition model is extremely low? Where can I get the complete dataset for the speaker recognition system? Thanks.
