Comments (4)
No problem!
Well, there are two options:
- Voice cloning (as you mentioned) - where you synthesize speech from a specific voice from text.
- Voice conversion - where you take audio from one speaker and directly convert it to a target speaker.
I think Real-Time-Voice-Cloning is the best available open-source project for voice cloning. For voice conversion, there are https://github.com/liusongxiang/StarGAN-Voice-Conversion and https://github.com/auspicious3000/autovc, for example.
Hope that helps!
from universalvocoding.
Hi @shoegazerstella,
It's fun to mess with the inputs, but I think changing the speech characteristics in any systematic way is pretty difficult. I remember the issue in #3 was that changing num_fft resulted in a pitch shift. I think a more principled method would be vocal tract length perturbation (see "Vocal tract length perturbation (VTLP) improves speech recognition" for details). It's relatively easy to mess with the mel filters in librosa, so that'd be a simple place to start.
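To make the VTLP idea a bit more concrete, here is a minimal sketch of the piecewise-linear frequency warp as I understand it from that paper, applied to mel filter center frequencies. It is pure Python and does not touch librosa or this repo's code. The function names, the 4800 Hz boundary default, and the HTK-style mel formula are all assumptions for illustration (librosa's default mel scale is the Slaney variant), and the sketch assumes the boundary frequency lies below Nyquist.

```python
import math


def vtlp_warp(freq_hz, alpha, sample_rate, f_hi=4800.0):
    """Piecewise-linear VTLP warp of one frequency (hypothetical helper).

    alpha is the warp factor (typically drawn from roughly 0.9 to 1.1).
    Frequencies below the boundary are scaled by alpha; above it, a second
    linear segment maps the rest so that Nyquist stays fixed.
    Assumes f_hi is below the Nyquist frequency.
    """
    nyquist = sample_rate / 2.0
    scaled_hi = f_hi * min(alpha, 1.0)
    boundary = scaled_hi / alpha
    if freq_hz <= boundary:
        return freq_hz * alpha
    slope = (nyquist - scaled_hi) / (nyquist - boundary)
    return nyquist - slope * (nyquist - freq_hz)


def hz_to_mel(freq_hz):
    # HTK-style mel formula (an assumption; not librosa's default scale).
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def warped_mel_centers(n_mels, sample_rate, alpha, f_min=0.0):
    """Center frequencies (Hz) of n_mels mel filters after VTLP warping."""
    mel_lo = hz_to_mel(f_min)
    mel_hi = hz_to_mel(sample_rate / 2.0)
    step = (mel_hi - mel_lo) / (n_mels + 1)
    centers = [mel_to_hz(mel_lo + step * (i + 1)) for i in range(n_mels)]
    return [vtlp_warp(f, alpha, sample_rate) for f in centers]


if __name__ == "__main__":
    for alpha in (0.9, 1.0, 1.1):
        centers = warped_mel_centers(n_mels=80, sample_rate=16000, alpha=alpha)
        print(alpha, [round(c) for c in centers[:5]])
```

The warped centers could then be used to build triangular filters in place of the unwarped ones, with a new alpha sampled per utterance if the goal is augmentation.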
Otherwise, if you're interested in changing the speaker entirely I've done some work on voice conversion here. There are also a bunch of papers/repos that convert the spectrogram directly and then synthesize with a vocoder (happy to suggest some if you're interested).
if you're interested in changing the speaker entirely I've done some work on voice conversion here. There are also a bunch of papers/repos that convert the spectrogram directly and then synthesize with a vocoder (happy to suggest some if you're interested).
Exactly, my aim is to change the speaker entirely.
I was reading more on voice cloning and I did find these two works:
But if I understand correctly, your approach to voice conversion is a little bit different. I'll look more into it!
Would be awesome if you could suggest other approaches too!
Thanks a lot!
So yes, there are indeed two approaches.
For the TTS part I was using an implementation of FastSpeech2, and to be honest I didn't want to change that because it's super fast on CPU.
So I might try both approaches and decide based on both the quality of the results and the speed.
Again thanks a lot! :)