joonson / voxceleb_unsupervised
Augmentation adversarial training for self-supervised speaker recognition
Hi, thank you for your amazing work. I'm wondering whether there are instructions for loading both the face image frames and the speech segments.
File "/nfs/spk/voxceleb_unsupervised/DatasetLoader.py", line 66, in __getitem__
audio = loadWAVSplit(self.data_list[index], self.max_frames).astype(numpy.float)
File "/nfs/spk/voxceleb_unsupervised/DatasetLoader.py", line 198, in loadWAVSplit
raise e
File "/nfs/spk/voxceleb_unsupervised/DatasetLoader.py", line 194, in loadWAVSplit
startframe = random.sample(range(0, randsize), 2)
File "/nfs/project/tools/anaconda2/lib/python3.6/random.py", line 320, in sample
raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative
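The error comes from `random.sample(range(0, randsize), 2)`, which raises `ValueError` whenever `randsize < 2`, i.e. when the utterance is too short to offer two distinct start positions for a pair of crops. A minimal sketch of a guard (the `audiosize`/`max_audio` names are assumptions mirroring the loader, not the repository's actual fix):

```python
import random

def pick_two_starts(audiosize, max_audio):
    """Pick two random, distinct start frames for a pair of crops.

    random.sample(range(0, randsize), 2) fails when randsize < 2,
    so short files fall back to two crops starting at position 0
    (in practice one would pad the audio instead).
    """
    randsize = audiosize - max_audio  # free room left for a crop
    if randsize < 2:
        return [0, 0]
    return random.sample(range(0, randsize), 2)
```

Padding short files up to `max_audio` before sampling is the more common fix, since it keeps the two crops distinct.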
Hello, thank you for the good resource.
As I understand the paper, gradient reversal is applied only during the embedding training phase.
However, in the source code the discriminator training phase also seems to go through the GRL layer, so I wonder why.
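For context, the layer in question is conceptually simple: identity in the forward pass, gradient negation (scaled by a coefficient) in the backward pass. A minimal NumPy sketch of the idea, not the repository's implementation:

```python
import numpy as np

class GradReverse:
    """Gradient reversal layer (illustrative sketch).

    Forward: identity. Backward: multiply the incoming gradient
    by -lam, so the encoder is trained to *fool* the downstream
    discriminator while the discriminator itself trains normally.
    """
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        # pass the input through unchanged
        return x

    def backward(self, grad_output):
        # flip and scale the gradient flowing back to the encoder
        return -self.lam * grad_output
```

Whether the discriminator's own update should also pass through the reversal depends on which parameters the optimizer steps on in that phase; if only the discriminator's weights are updated, the sign flip upstream of them has no effect on its update.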
Thanks for the great work. Can you describe specifically how the 1000 pre-computed RIR filters were generated?
Thanks for your work. When reproducing your paper, I can't match the reported results: AP+Noise+RIR EER = 9.56% and AP+AAT+Noise+RIR = 8.65%. I only get AP+Noise+RIR EER = 11.78% and AP+AAT+Noise+RIR = 10.08%. Is there anything else I need to pay attention to during training?
import numpy
from scipy import signal

def gen_echo(ref, rir, filterGain):
    # scale the RIR by filterGain (in dB), convolve, and trim to the input length
    rir = numpy.multiply(rir, pow(10, 0.1 * filterGain))
    echo = signal.convolve(ref, rir, mode='full')[:len(ref)]
    return echo
In this function, the rir array is float32 but ref is int16.
Can convolve operate on arrays of different dtypes?
Is there any problem from a signal-processing point of view?
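NumPy's type promotion handles the mixed int16/float32 inputs (the result is floating point), but an explicit cast makes the intent clear and avoids surprises when normalising afterwards. A sketch of that variant, not the authors' code; `np.convolve` is used here since it matches `scipy.signal.convolve` for 1-D inputs:

```python
import numpy as np

def gen_echo_float(ref, rir, filter_gain_db):
    # cast int16 PCM to float32 before filtering (sketch, not the repo's code)
    ref = np.asarray(ref, dtype=np.float32)
    # apply the gain in dB to the RIR
    rir = np.asarray(rir, dtype=np.float32) * 10.0 ** (0.1 * filter_gain_db)
    # full convolution, trimmed back to the input length
    return np.convolve(ref, rir, mode='full')[:len(ref)]
```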
Hello,
for ii in range(0, it - 1):
    if ii % args.test_interval == 0:
        clr = s.updateLearningRate(args.lr_decay)
This seems to apply one extra learning-rate decay when ii is zero.
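The off-by-one can be illustrated with a small counting sketch (illustrative only; the names follow the snippet above):

```python
def schedule_decays(it, test_interval):
    """Count learning-rate decays under the original and fixed loops.

    The original condition fires at ii == 0, decaying the rate once
    before any training has happened; requiring ii > 0 avoids that.
    """
    decays_original = sum(
        1 for ii in range(0, it - 1) if ii % test_interval == 0)
    decays_fixed = sum(
        1 for ii in range(0, it - 1) if ii > 0 and ii % test_interval == 0)
    return decays_original, decays_fixed
```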
Thank you.
What's the best unsupervised method you know of? And has anyone reproduced the AAT result?
Hi. Thank you very much for the good material!
In your code, I can't find the augmentation part (both the RIR and noise) employed in your paper.
In addition, I'm wondering whether the code for Augmentation Adversarial Training will be released.
Thank you.