
Comments (8)

Alexius08 avatar Alexius08 commented on June 23, 2024

Editing wav_path in hparams.py turned out to be the key. I ran into another problem, however: the training scripts refuse to run at all on my new datasets. For each of them, the generated train_dataset.pkl only contains the following: €�]”. No idea what is causing this.

from forwardtacotron.

Alexius08 avatar Alexius08 commented on June 23, 2024

Got past that error message by setting all wav files to 16 bits and 22050 Hz, but I ran into another error:
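For anyone hitting the same thing, here is a quick standard-library check that every wav in a folder is 16-bit at the target rate (the directory path is a placeholder, not part of the repo):

```python
import wave
from pathlib import Path

def check_wavs(wav_dir: str, rate: int = 22050, sampwidth: int = 2):
    """Return (name, rate, bit depth) for wavs not matching the target format."""
    bad = []
    for p in Path(wav_dir).glob('*.wav'):
        with wave.open(str(p), 'rb') as w:
            if w.getframerate() != rate or w.getsampwidth() != sampwidth:
                bad.append((p.name, w.getframerate(), 8 * w.getsampwidth()))
    return bad
```

Running it before preprocessing makes format mismatches visible up front instead of failing later in training.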

Traceback (most recent call last):
  File "train_tacotron.py", line 192, in <module>
    trainer.train(model, optimizer)
  File "C:\Users\Alexius08\Documents\GitHub\ForwardTacotron\trainer\taco_trainer.py", line 37, in train
    self.train_session(model, optimizer, session)
  File "C:\Users\Alexius08\Documents\GitHub\ForwardTacotron\trainer\taco_trainer.py", line 57, in train_session
    for i, (x, m, ids, x_lens, mel_lens) in enumerate(session.train_set, 1):
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
    data = self._next_data()
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 474, in _next_data
    index = self._next_index()  # may raise StopIteration
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 427, in _next_index
    return next(self._sampler_iter)  # may raise StopIteration
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\sampler.py", line 227, in __iter__
    for idx in self.sampler:
  File "C:\Users\Alexius08\Documents\GitHub\ForwardTacotron\utils\dataset.py", line 268, in __iter__
    binned_idx = np.stack(bins).reshape(-1)
  File "<__array_function__ internals>", line 5, in stack
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\shape_base.py", line 422, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack
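(For context, the final error just means the sampler ended up with an empty list of bins; np.stack refuses an empty sequence:)

```python
import numpy as np

bins = []  # what the sampler ends up with when no training items survive filtering
try:
    np.stack(bins).reshape(-1)
except ValueError as e:
    print(e)  # "need at least one array to stack"
```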

The generated train_dataset.pkl, val_dataset.pkl and text_dict.pkl files don't have line breaks at all.


cschaefer26 avatar cschaefer26 commented on June 23, 2024

Hi, did you also change the data path in hparams? Otherwise it would probably mix two datasets. The error message indicates that there is no training file to be loaded. I would double-check whether the wav file names match the ids in the metafile.csv (if you run preprocess.py it should say something about how many files are used).
The train_dataset.pkl is binary pickled; if you want to have a look at it you need to load it with the unpickle_binary() function in utils, which is probably worth doing for debugging. You could also have a look in data/mel and see if any files are there.
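For reference, an empty dataset list pickles down to a handful of opaque bytes, which is exactly the kind of mojibake a text editor shows for such a file; a minimal sketch with plain pickle (the repo's own helper is unpickle_binary() in utils):

```python
import pickle

data = pickle.dumps([])    # an empty dataset list
print(data)                # a few opaque bytes starting with b'\x80'
print(pickle.loads(data))  # []

# inspecting the generated file directly would look like:
# with open('data/train_dataset.pkl', 'rb') as f:
#     dataset = pickle.load(f)
# print(len(dataset))
```

A length of 0 here confirms that preprocessing filtered out every training item.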

As for the other questions:

When generating sentences, how could I use a specific model when generating?

You can switch by using the --hp_file and --tts_weights flags for the corresponding models. If your models differ in hyperparams, you would need to save the different hparams.py files somewhere. If the hparams are the same, just setting --tts_weights to the ***_weights.pyt model should be enough.

Do the results of previous trainings in previous models affect training new models?

No.

If I add new audio samples to one of my datasets and preprocess it again, would the training for the model start from the beginning again, or could it pick up from where it left off before new samples were added?

If you don't change the tts_model_id in hparams.py, it is going to resume training the previous model; otherwise it creates a new directory with the new tts_model_id under checkpoints.


Alexius08 avatar Alexius08 commented on June 23, 2024

Just checked the binary pickled files. The training data for my first custom dataset is an empty array. The training data for my second custom dataset, as well as the val dataset and text dictionary for both datasets, look normal: the val data and the normally generated training data are arrays of tuples containing a filename and a three-digit number, and the text dictionary is a large object pairing each filename with the IPA equivalent of its transcript. Meanwhile, the mel, quant, and raw_pitch folders each contain one .npy file for every wav file in the dataset, while the phon_pitch folders for both datasets are empty.


cschaefer26 avatar cschaefer26 commented on June 23, 2024

In this case it seems to me that there is a mismatch between the text ids and the wav file names, because only matching files are taken into account. Did you check this? I.e. you could debug in the preprocess.py file and check how many files are filtered at line 86. The stemmed wav file names should match the ids in the metafile (e.g. the line 00001|some text corresponds to 00001.wav).
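A small sketch of that check, assuming the id|text metafile format described above (function name and paths are illustrative, not from the repo):

```python
from pathlib import Path

def find_mismatches(metafile: str, wav_dir: str):
    """Compare ids in the metafile (one 'id|text' entry per line) to wav stems.

    Returns (ids with no wav file, wav files with no metafile entry).
    """
    with open(metafile, encoding='utf-8') as f:
        meta_ids = {line.split('|', 1)[0] for line in f if line.strip()}
    wav_ids = {p.stem for p in Path(wav_dir).glob('*.wav')}
    return meta_ids - wav_ids, wav_ids - meta_ids
```

Any id in either returned set is an item the preprocessing step would silently drop.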


Alexius08 avatar Alexius08 commented on June 23, 2024

When running preprocess.py, there's no mismatch. The number of files found equals the number of indexed files. However, when I added more clips to the two datasets, running train_tacotron.py went smoothly for one of the datasets (40 minutes split across 370 clips), while I still got an error message with the other (29 minutes split across 250 clips). Perhaps dataset size has something to do with these errors.
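That guess is plausible: if a length-binned sampler keeps only complete bins and the dataset is smaller than a single bin, the bin list is empty and np.stack raises exactly the error from the traceback. A simplified illustration (not the actual sampler code from utils/dataset.py):

```python
import numpy as np

def binned_indices(num_items: int, bin_size: int):
    """Group indices into complete bins only, as a binned batch sampler might."""
    idx = np.arange(num_items)
    bins = [idx[i:i + bin_size]
            for i in range(0, num_items - bin_size + 1, bin_size)]
    return np.stack(bins).reshape(-1)  # raises ValueError if bins is empty
```

With 10 items and a bin size of 4 this yields the first 8 indices; with 5 items and a bin size of 8 there is no complete bin and it raises the "need at least one array to stack" ValueError.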


Alexius08 avatar Alexius08 commented on June 23, 2024

Also, I had to change lines 39 and 40 on my copy of train_tacotron.py to point it to my dataset's pickle files. Left unchanged, it kept trying to access the LJSpeech alignment files.


cschaefer26 avatar cschaefer26 commented on June 23, 2024

Good point, I will change the scripts to take the hparams setting into account. I honestly mostly keep the data naming the same and make copies of the dataset if I train a new model. Could you solve the issue with the smaller dataset? I'm not sure what you mean by adding clips to the dataset; you would have to preprocess the whole dataset again if you add clips (otherwise it's not generating the correct train_dataset.pkl file).

