Comments (8)
Editing wav_path in hparams.py turned out to be the key. I ran into another problem, however: the training scripts refuse to run at all on my new datasets. For each new dataset, the generated train_dataset.pkl contains only the following: €�]”.
No idea what is causing this.
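For what it's worth, those characters look like the cp1252 rendering of the bytes a pickled empty list produces, which would mean preprocessing kept zero training items. A quick sketch (standalone, not the repo's code):

```python
import pickle

# € = 0x80, � = 0x04 (unprintable), ] = 0x5D, ” = 0x94, . = 0x2E in cp1252,
# i.e. the exact bytes of an empty list pickled with protocol 4.
data = b'\x80\x04]\x94.'
print(pickle.loads(data))  # [] -- no training examples were written
```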
from forwardtacotron.
Got past that error message by setting all wav files to 16 bits and 22050 Hz, but I ran into another error:
Traceback (most recent call last):
  File "train_tacotron.py", line 192, in <module>
    trainer.train(model, optimizer)
  File "C:\Users\Alexius08\Documents\GitHub\ForwardTacotron\trainer\taco_trainer.py", line 37, in train
    self.train_session(model, optimizer, session)
  File "C:\Users\Alexius08\Documents\GitHub\ForwardTacotron\trainer\taco_trainer.py", line 57, in train_session
    for i, (x, m, ids, x_lens, mel_lens) in enumerate(session.train_set, 1):
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
    data = self._next_data()
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 474, in _next_data
    index = self._next_index()  # may raise StopIteration
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 427, in _next_index
    return next(self._sampler_iter)  # may raise StopIteration
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\sampler.py", line 227, in __iter__
    for idx in self.sampler:
  File "C:\Users\Alexius08\Documents\GitHub\ForwardTacotron\utils\dataset.py", line 268, in __iter__
    binned_idx = np.stack(bins).reshape(-1)
  File "<__array_function__ internals>", line 5, in stack
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\shape_base.py", line 422, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack
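The failing call at the bottom of the trace is easy to reproduce in isolation: np.stack raises exactly this error whenever it is handed an empty sequence, which is what the bucketing sampler ends up with when the dataset has no items (illustrative sketch, not the repo's code):

```python
import numpy as np

bins = []  # what the sampler builds when there are no training items
try:
    np.stack(bins).reshape(-1)
except ValueError as e:
    print(e)  # need at least one array to stack
```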
The generated train_dataset.pkl, val_dataset.pkl and text_dict.pkl files don't have line breaks at all.
from forwardtacotron.
Hi, did you also change the data path in hparams? Otherwise it would probably mix two datasets. The error message indicates that there is no training file to be loaded. I would double-check whether the wav file names match the ids in the metafile.csv (if you run preprocess.py, it should say how many files are used).
The train_dataset.pkl is binary-pickled; if you want to have a look at it, you need to load it with the unpickle_binary() function in utils, which probably makes sense for debugging. You could also look in data/mel and see whether any files are there.
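A minimal stand-in for that helper (the real unpickle_binary lives in the repo's utils module; the signature here is assumed):

```python
import pickle
from pathlib import Path

def unpickle_binary(path):
    """Load a binary-pickled object, as the repo's utils helper does (assumed)."""
    with open(path, 'rb') as f:
        return pickle.load(f)

# e.g. (path is illustrative):
# train_set = unpickle_binary(Path('data/train_dataset.pkl'))
# print(len(train_set))  # 0 here means preprocessing kept no files
```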
As for the other questions:
When generating sentences, how can I use a specific model?
You can switch by using the --hp_file and --tts_weights flags for the corresponding models. If your models differ in hyperparams, you would need to save the different hparams.py files somewhere. If the hparams are the same, setting --tts_weights to the ***_weights.pyt model should be enough.
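Put together, switching models might look like the following; the script name and checkpoint path are illustrative, only the --hp_file and --tts_weights flags are taken from the advice above:

```shell
# Hypothetical invocation: script name and paths are placeholders
python gen_forward.py \
    --hp_file my_voice_hparams.py \
    --tts_weights checkpoints/my_voice_tts/latest_weights.pyt
```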
Do the results of previous trainings in previous models affect training new models?
No.
If I add new audio samples to one of my datasets and preprocess it again, would the training for the model start from the beginning again, or could it pick up from where it left off before the new samples were added?
If you don't change the tts_model_id in hparams.py, it is going to resume training the previous model; otherwise it creates a new directory with the new tts_model_id under checkpoints.
from forwardtacotron.
Just checked the binary-pickled files. The training data for my first custom dataset is an empty array, while the training data for my second custom dataset, as well as the validation dataset and text dictionary for both datasets, look normal: the validation data (and the successfully generated training data) are arrays of tuples pairing a filename with a three-digit number, and the text dictionary is a massive object mapping each filename to the IPA equivalent of its text transcript. Meanwhile, the mel, quant, and raw_pitch folders each have one .npy file for every wav file in the dataset, while the phon_pitch folder is empty for both datasets.
from forwardtacotron.
In this case it seems to me that there is a mismatch between text ids and wav file names, because only matching files are taken into account. Did you check this? I.e. you could debug in the preprocess.py file and check how many files are filtered at line 86. The stemmed wav file names should match the ids in the metafile (e.g. 00001|some text. corresponds to 00001.wav).
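One quick way to check this outside the repo is to diff the wav file stems against the first |-separated column of the metafile (paths below are placeholders):

```python
from pathlib import Path

def check_ids(wav_dir, metafile):
    """Return (wav stems missing from metafile, metafile ids missing a wav)."""
    wav_ids = {p.stem for p in Path(wav_dir).glob('*.wav')}
    meta_ids = {
        line.split('|', 1)[0]
        for line in Path(metafile).read_text(encoding='utf-8').splitlines()
        if line.strip()
    }
    return sorted(wav_ids - meta_ids), sorted(meta_ids - wav_ids)

# e.g. (placeholder paths):
# no_meta, no_wav = check_ids('data/wavs', 'data/metadata.csv')
# print('wavs with no metafile entry:', no_meta)
# print('metafile ids with no wav:', no_wav)
```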
from forwardtacotron.
When running preprocess.py, there's no mismatch: the number of files found equals the number of indexed files. However, when I added more clips to the two datasets, running train_tacotron.py went smoothly for one of them (40 minutes split across 370 clips), while I still got the error with the other (29 minutes split across 250 clips). Perhaps dataset size has something to do with these errors.
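If dataset size is the suspect, the clip count and total duration can be tallied with the standard library alone (the directory path is a placeholder):

```python
import wave
from pathlib import Path

def dataset_stats(wav_dir):
    """Return (clip count, total minutes) for all .wav files in a directory."""
    seconds, n = 0.0, 0
    for p in Path(wav_dir).glob('*.wav'):
        with wave.open(str(p)) as w:
            seconds += w.getnframes() / w.getframerate()
        n += 1
    return n, seconds / 60

# e.g.: clips, minutes = dataset_stats('data/wavs')
```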
from forwardtacotron.
Also, I had to change lines 39 and 40 on my copy of train_tacotron.py to point it to my dataset's pickle files. Left unchanged, it kept trying to access the LJSpeech alignment files.
from forwardtacotron.
Good point, I will change the scripts to take the hparams setting into account. I honestly mostly leave the data naming the same and make copies of the dataset if I train a new model. Could you solve the issue with the smaller dataset? I'm not sure what you mean by adding clips to the dataset; you would have to preprocess the whole dataset again if you add clips (otherwise it's not generating the correct train_dataset.pkl file).
from forwardtacotron.