
Comments (3)

Kyubyong avatar Kyubyong commented on July 28, 2024

Technically speaking, the mel scale is not exactly the same as a log scale. See https://en.wikipedia.org/wiki/Mel_scale. The paper says they use a mel spectrogram and a linear-scale log magnitude spectrogram. So spectrogram2wav converts the predicted magnitude to the waveform. It has nothing to do with the mel spectrogram.
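To make the "mel is not exactly log" point concrete, here is a minimal sketch using one common mel formula, m = 2595 * log10(1 + f / 700). The function name `hz_to_mel` and the choice of this particular formula are assumptions for illustration; they are not taken from the tacotron repo. On a purely logarithmic scale every doubling of frequency would add the same number of mels, and the sketch shows that it does not.

```python
import math

def hz_to_mel(f_hz):
    """One common mel formula (assumed here): m = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

# Mel distance covered by one octave (a doubling of frequency),
# measured at a low, a mid, and a high starting frequency.
octave_steps = [hz_to_mel(2 * f) - hz_to_mel(f) for f in (100, 1000, 4000)]

for f, step in zip((100, 1000, 4000), octave_steps):
    print(f"{f:>5} Hz -> {2 * f:>5} Hz spans {step:.1f} mel")
```

The octave starting at 100 Hz spans far fewer mels than the one starting at 4000 Hz, because the `1 + f / 700` term makes the scale close to linear at low frequencies and only approximately logarithmic at high ones.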

In my opinion, there are two reasons why people care about whether we apply the logarithm to the magnitude in training. First, we, or at least I, don't have a full understanding of why it is useful. Second, in practice it needs our attention because padding happens in three places: reducing frames, dynamic padding, and convolution with same padding.

For now I don't know why there's a problem when you set use_log_magnitude to False. I'll check soon.

from tacotron.

Kyubyong avatar Kyubyong commented on July 28, 2024

Oh, now I know the reason. For the plain magnitude we must not allow negative numbers, because the value is later raised to a power. I simply clipped negative values to zero. https://github.com/Kyubyong/tacotron/blob/master/eval.py#L70
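A minimal sketch of the problem and the clipping fix, assuming NumPy. The helper name `sharpen_magnitude`, the exponent 1.2, and the sample values are hypothetical and only illustrate the idea; this is not a copy of the linked line in eval.py.

```python
import numpy as np

def sharpen_magnitude(predicted, power=1.2):
    """Clip negative predictions to zero, then raise to a power.

    `power=1.2` is an assumed value for illustration only.
    """
    clipped = np.maximum(predicted, 0.0)  # negative magnitudes are not meaningful
    return clipped ** power

# A network with an unconstrained output can predict slightly negative values.
predicted = np.array([-0.05, 0.0, 0.3, 1.0])

# Without clipping, a negative base with a fractional exponent yields NaN.
with np.errstate(invalid="ignore"):
    unclipped = predicted ** 1.2

sharpened = sharpen_magnitude(predicted)

print(unclipped)
print(sharpened)
```

A NaN in the magnitude would carry through to any later computation that uses it, so clipping before the power keeps the reconstruction input finite.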


chief7 avatar chief7 commented on July 28, 2024

Well, you're right when you say that mel isn't exactly log. And to be honest, my explanation isn't much more than a first guess. I didn't check every part of the code and I agree with you: I'm not quite sure what the log stuff is all about.

But even after your latest commits, I can't generate non-silent audio from my model if use_log_magnitude is False.

