dhgrs / chainer-ClariNet
A Chainer implementation of ClariNet.
Hi,
I have trained both teacher/student networks.
The teacher has very nice quality, although generation takes far too long (about 6 minutes per 1 s of audio on a GPU).
The student results sound a bit robotic and have "noise" where there should be silence.
Do you have any tips on that? (I haven't changed any parameters besides the path to the trained teacher model.)
UPDATE:
Here are my examples for two short (~1 s) files:
clarinet-results.zip
Thanks.
Hello
I would like to experiment with a 48 kHz sampling frequency.
What parameter settings would be good?
I also want to feed in acoustic features computed outside the script.
Is there a good way to do that?
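As a rough starting point for 48 kHz, one common approach is to scale the frame-level parameters so their durations in seconds stay the same as at the original rate. A minimal sketch, assuming hypothetical parameter names (sr, n_fft, hop_length, win_length) that may not match params.py:

```python
# Hypothetical sketch (the names sr, n_fft, hop_length, win_length are
# illustrative and may not match params.py): scale sample-count parameters so
# the window and hop durations in seconds stay the same at the new rate.
base = {"sr": 22050, "n_fft": 1024, "hop_length": 256, "win_length": 1024}

def scale_params(params, target_sr):
    """Scale frame-level sample counts by the ratio of sampling rates."""
    ratio = target_sr / params["sr"]
    scaled = {"sr": target_sr}
    for key in ("n_fft", "hop_length", "win_length"):
        scaled[key] = int(round(params[key] * ratio))
    return scaled

print(scale_params(base, 48000))  # hop_length 256 -> 557 (~11.6 ms either way)
```

Whether the network architecture (e.g. upsampling factors) also needs adjusting depends on how the conditioning is upsampled, so treat this as a first guess only.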
Hi,
the ClariNet paper also describes a Text-to-Wave architecture for end-to-end TTS.
Do you have any suggestions on what I would need for full TTS once the student network is trained?
Should I use a pre-trained model such as Tacotron 2 or Deep Voice 3 to produce mel-spectrograms? Or something else entirely?
Thanks!
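For reference, the usual bridge in such a two-stage pipeline is that the text-to-mel frontend and the vocoder must agree on the mel-spectrogram format (channel count, frame hop, normalization). A toy sketch with entirely hypothetical function names, not the repo's API:

```python
import numpy as np

# Hypothetical sketch: synthesize_mel stands in for a text-to-mel frontend
# (e.g. Tacotron 2); the trained ClariNet student would play the vocoder role.
def synthesize_mel(text, n_mels=80, frames_per_char=6):
    # Dummy frontend: returns a zero mel spectrogram of a plausible shape.
    return np.zeros((n_mels, len(text) * frames_per_char), dtype=np.float32)

mel = synthesize_mel("Hello world")
# The channel count and frame hop must match what the student was trained on,
# otherwise the conditioning upsampling will not line up with the audio.
assert mel.shape == (80, 66)
# audio = student_vocoder(mel)  # hypothetical call to the trained student
```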
Hi,
is there a reason why
StudentGaussianIAF/teacher_params.py
has length=12000
and
StudentGaussianIAF/params.py
has length=24000
while all the other preprocessing parameters stay the same?
Thanks.
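For context, assuming length counts raw audio samples and LJSpeech's 22,050 Hz rate, the two settings correspond to different training-segment durations:

```python
# Assumption: `length` counts raw audio samples; LJSpeech is sampled at 22050 Hz.
sr = 22050
for length in (12000, 24000):
    print(f"length={length} -> {length / sr:.2f} s per training segment")
# length=12000 -> 0.54 s, length=24000 -> 1.09 s
```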
Hi,
Thanks for sharing.
It seems there is an extra sampling step after the student network at https://github.com/dhgrs/chainer-ClariNet/blob/master/StudentGaussianIAF/net.py#L58 to get student samples. However, Algorithm 1
in https://arxiv.org/pdf/1807.07281.pdf uses z from the last IAF flow as the student samples. What is the motivation for making this change?
Thanks,
Jian
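To make the two options concrete, here is a NumPy-only toy sketch (not the repo's Chainer code) of using the IAF output directly versus drawing one extra sample from the predicted output Gaussian; the parameter values are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
z0 = rng.standard_normal(4)    # base noise fed into the flow
mu, log_s = 0.1, -1.0          # toy per-sample Gaussian parameters from the last flow

# Option 1 (Algorithm 1 in the paper): the transformed noise IS the sample.
z = z0 * np.exp(log_s) + mu

# Option 2 (the extra step asked about): redraw from N(mu, s^2) with fresh noise.
extra = mu + np.exp(log_s) * rng.standard_normal(4)

# Both have the same marginal distribution, but Option 2 breaks the
# deterministic dependence on z0 (and hence the gradient path through it).
assert z.shape == extra.shape == (4,)
```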
Hi,
I trained the AutoregressiveWaveNet using the command from README.md:
python train.py -g 0
Then I tried to generate audio using the command from README.md:
python generate.py -i ../../LJSpeech-1.1/wavs/LJ001-0001.wav -o result.wav -m 2018_09_27_16_03_22/snapshot_iter_500000 -g 0
The generated audio was completely silent; do you have any tips on what could have gone wrong?
Thanks.
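One quick way to tell whether the output is truly silent (all-zero samples) or just very quiet is to inspect the peak amplitude. A sketch assuming the result is a 16-bit PCM mono WAV:

```python
import wave

import numpy as np

def peak_amplitude(path):
    """Return the peak absolute sample value of a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as f:
        frames = f.readframes(f.getnframes())
    samples = np.frombuffer(frames, dtype=np.int16)
    return int(np.abs(samples).max()) if samples.size else 0

# A truly silent file peaks at (or near) 0; a quiet-but-alive one is well above.
# print(peak_amplitude("result.wav"))
```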
Hello.
I would like to experiment with other acoustic features that have already been extracted from speech.
I have tried several approaches, but so far none work well.
Could you tell me if there is a good way to do this?
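A common pattern for this is to precompute the features offline (e.g. as .npy files) and load them in place of the script's extractor. A sketch under the assumption that the network conditions on a (channels, frames) float32 array; the function name and layout are illustrative, not the repo's API:

```python
import numpy as np

# Assumptions: features were saved with np.save and should condition the
# network as a (channels, frames) float32 array; names are illustrative only.
def load_precomputed_features(path, n_channels=80):
    feats = np.load(path)
    if feats.ndim != 2:
        raise ValueError("expected a 2-D feature array")
    # Accept (frames, channels) files by transposing to (channels, frames).
    if feats.shape[0] != n_channels and feats.shape[1] == n_channels:
        feats = feats.T
    return feats.astype(np.float32)
```

Whatever layout is chosen, it must match the shape the dataset loader feeds to the conditioning network, so check against the script's own extraction code.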
I want to analyze the samples, but it seems they cannot be downloaded from nana-music. Is there any chance you could share your samples in a different way?
Thanks,
Hi,
Do you have any plans for this?
thanks