Comments (25)

commented on August 15, 2024

I also changed alpha to a constant alpha.
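
In WGAN-GP, the alpha in question is the random interpolation coefficient used when building the samples for the gradient penalty; making it constant would look roughly like this (an illustrative sketch, not the repo's actual code):

```python
import torch

batch_size = 8
real = torch.randn(batch_size, 3, 288, 288)   # placeholder real frames
fake = torch.randn(batch_size, 3, 288, 288)   # placeholder generated frames

# Standard WGAN-GP: a fresh uniform-random alpha per sample.
alpha = torch.rand(batch_size, 1, 1, 1)
# "Constant alpha" variant: the same fixed mixing coefficient every step.
alpha = torch.full((batch_size, 1, 1, 1), 0.5)

interpolated = alpha * real + (1 - alpha) * fake
```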

commented on August 15, 2024

I just missed one line from my code when I copied it. Thank you.

NikitaKononov commented on August 15, 2024

> I just missed one line from my code when I copied it. Thank you.

I just guessed it 10 minutes ago :) gt is ground truth, so g must be the fake, I figured.
Thanks :)

NikitaKononov commented on August 15, 2024

> I just missed one line from my code when I copied it. Thank you.

I think you've missed some imports too, torch.autograd for example.

commented on August 15, 2024

OK, imported.

NikitaKononov commented on August 15, 2024

> I also changed alpha to a constant alpha.

wasserstein_loss(y_pred, y_true) does not seem to be used anywhere. Should I use it instead of perceptual_loss = -pred.mean()?

Without the gradient penalty and Wasserstein loss (last commit), my Tesla V100 could handle batch size 16, but now only batch size 8 works. Maybe something is not being freed from VRAM?
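
For what it's worth, with the usual ±1 label convention the two are equivalent on the generator side; a quick sketch (assuming wasserstein_loss is the standard mean(y_true * y_pred) form):

```python
import torch

def wasserstein_loss(y_pred: torch.Tensor, y_true: torch.Tensor) -> torch.Tensor:
    # Standard form: labels are +1/-1, so the critic pushes real and fake
    # scores apart and the generator pulls fake scores up.
    return torch.mean(y_true * y_pred)

pred = torch.randn(8, 1)  # hypothetical critic scores on generated frames
# Generator samples are labeled -1, so the loss collapses to -pred.mean():
assert torch.allclose(wasserstein_loss(pred, -torch.ones_like(pred)),
                      -pred.mean())
```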

NikitaKononov commented on August 15, 2024

> I also changed alpha to a constant alpha.

wloss_hq_wav2lip_train.py seems to be assembled from different parts of your work, but it's not obvious how to make them work together. For example, I can't understand how the gradient penalty affects the training process, and the Wasserstein loss is not used in the code. The cosine loss must be changed too, I think, because it sometimes goes negative due to the use of PReLU.
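
To make the gradient penalty's role concrete, here is a minimal PyTorch sketch of the standard WGAN-GP penalty (names are illustrative; the repo's version may differ):

```python
import torch
from torch import autograd

def gradient_penalty(critic, real, fake):
    # Random points on the lines between real and fake samples.
    alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (alpha * real.detach() + (1 - alpha) * fake.detach()).requires_grad_(True)

    scores = critic(interp)
    grads = autograd.grad(
        outputs=scores,
        inputs=interp,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,   # keep the graph so the penalty is differentiable
    )[0]

    grads = grads.view(grads.size(0), -1)
    # Penalize the gradient norm's deviation from 1 (1-Lipschitz constraint).
    return ((grads.norm(2, dim=1) - 1) ** 2).mean()
```

The penalty is added to the critic loss, roughly critic(fake).mean() - critic(real).mean() + λ * gradient_penalty(...), with λ = 10 in the original paper; that soft 1-Lipschitz constraint is how it affects training.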

commented on August 15, 2024

I implemented it based on the original paper; in your case, more modifications are needed.

commented on August 15, 2024

You should use Euclidean distance instead of cosine; that may solve your problem. I've encountered it myself.
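
A sketch of what that swap could look like (illustrative names; the original Wav2Lip sync loss is BCE over cosine similarity):

```python
import torch
import torch.nn.functional as F

def cosine_sync_loss(audio_emb, video_emb):
    # Wav2Lip-style: BCE over cosine similarity. Cosine similarity ranges
    # over [-1, 1], and with PReLU (instead of ReLU) negative values really
    # do occur, so the similarity must be clamped before BCE.
    sim = F.cosine_similarity(audio_emb, video_emb).clamp(min=1e-7, max=1.0)
    return F.binary_cross_entropy(sim, torch.ones_like(sim))

def euclidean_sync_loss(audio_emb, video_emb):
    # Distance-based alternative: non-negative by construction, so PReLU
    # embeddings cannot drive the loss below zero.
    return F.pairwise_distance(audio_emb, video_emb).mean()
```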

commented on August 15, 2024

To understand the Wasserstein loss better, read this code: https://github.com/EmilienDupont/wgan-gp/blob/master/training.py
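
The structure of that training loop, paraphrased (a sketch only; generator, critic, the optimizers, and gradient_penalty are assumed to be defined as in the earlier snippets):

```python
def train_iteration(step, real, z, generator, critic, g_opt, d_opt,
                    gp_weight=10.0, critic_iters=5):
    # Critic step (runs every iteration): push real and fake scores apart,
    # regularized by the gradient penalty.
    fake = generator(z).detach()          # detach: no generator grads here
    d_loss = (critic(fake).mean() - critic(real).mean()
              + gp_weight * gradient_penalty(critic, real, fake))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step (only every `critic_iters` iterations): raise the
    # critic's score on fresh fakes.
    if step % critic_iters == 0:
        fake = generator(z)
        g_loss = -critic(fake).mean()
        g_opt.zero_grad()
        g_loss.backward()
        g_opt.step()
```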

commented on August 15, 2024

I left out the modified model block; I will publish it soon.

NikitaKononov commented on August 15, 2024

> To understand the Wasserstein loss better, read this code: https://github.com/EmilienDupont/wgan-gp/blob/master/training.py

Thanks a lot for the advice; I will try using Euclidean distance.
Oh, this code makes it much clearer. If I've understood right, in our case I should split training between the critic (the visual quality discriminator) and the generator (Wav2Lip). But where should I place the SyncNet penalty? Just include SyncNet in the resulting loss with a corresponding weight?

commented on August 15, 2024

Don't penalize with your SyncNet yet; you need to focus on image quality first and then focus on lip-sync. It will be easier than covering two tasks at the same time.

NikitaKononov commented on August 15, 2024

> Don't penalize with your SyncNet yet; you need to focus on image quality first and then focus on lip-sync. It will be easier than covering two tasks at the same time.

You've misunderstood me :) I mean that we use SyncNet as a kind of penalty for Wav2Lip: we guide Wav2Lip to produce lip-synced results. So where should I place that?
I've encountered a problem with SyncNet while training the previous Wav2Lip-HQ. If SyncNet was included in the training process too late, I wasn't able to get the sync loss below 2.0. If it was included too early, the sync loss was good, but the perceptual loss didn't drop below 0.72.
Maybe you have a recipe for the best moment to include SyncNet in the training process?
And some advice about the disc weight and SyncNet weight: by default they are 0.07 and 0.03. Changing those weights affects training, but I can't find a pattern. Maybe you've experimented with those weights.

NikitaKononov commented on August 15, 2024

> Don't penalize with your SyncNet yet; you need to focus on image quality first and then focus on lip-sync. It will be easier than covering two tasks at the same time.

Hello, sorry for disturbing you again.

I am implementing WGAN-GP-style training for Wav2Lip-HQ. Everything goes OK, but I have a question. We compute the generator loss like this:

```python
d_generated = self.D(generated_data)   # critic scores for generated frames
g_loss = -d_generated.mean()           # generator maximizes the critic score
```

How do I add the SyncNet loss correctly?

```python
sync_loss = syncnet(mel, generated_data)
g_loss = -d_generated.mean() * 0.97 + sync_loss * 0.03
```

Is that correct? I am not sure.
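
For comparison, if memory serves, the original Wav2Lip trainer weights its terms additively alongside an L1 reconstruction loss, roughly like this (a sketch with approximate names; check hq_wav2lip_train.py for the exact form):

```python
# Sketch of original-Wav2Lip-style weighting (names approximate).
# syncnet_wt = 0.03 and disc_wt = 0.07 by default; the remaining weight
# goes to L1 reconstruction, not to the adversarial term alone.
g_loss = (syncnet_wt * sync_loss
          + disc_wt * (-d_generated.mean())              # adversarial term
          + (1.0 - syncnet_wt - disc_wt) * l1_loss)      # reconstruction term
```

Note that the 0.97/0.03 split above has no reconstruction term at all, which is a real difference from the original recipe.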

commented on August 15, 2024

It depends on the distribution of your dataset. As you know, the critic (discriminator) tries to maximize the distance between real and fake, and the generator does the opposite.

NikitaKononov commented on August 15, 2024

> It depends on the distribution of your dataset. As you know, the critic (discriminator) tries to maximize the distance between real and fake, and the generator does the opposite.

With the gradient penalty, training takes up a very large amount of VRAM. I think something unnecessary is being kept in the computation graph, but I don't know exactly what it is or what I can safely drop from memory. Maybe you have some tips on how to minimize VRAM consumption?
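
A few general PyTorch habits that usually help here (not repo-specific; tensor names follow the earlier sketches):

```python
# 1. Detach fakes for the critic step so the generator's graph is freed early.
fake = generator(z).detach()

# 2. create_graph=True is needed inside the penalty's autograd.grad call,
#    but the outer backward() should not retain the graph afterwards:
d_loss.backward()                  # not d_loss.backward(retain_graph=True)

# 3. Log Python floats, not tensors; .item() drops the graph reference.
running_d_loss += d_loss.item()

# 4. Free large temporaries at the end of each iteration.
del fake, d_loss
torch.cuda.empty_cache()           # releases cached blocks back to the GPU
```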

commented on August 15, 2024

Delete the tensors every epoch.

NikitaKononov commented on August 15, 2024

> Delete the tensors every epoch.

Every epoch? Maybe you mean every iteration?

NikitaKononov commented on August 15, 2024

> Delete the tensors every epoch.

Here are my WGAN-GP Wav2Lip graphs... something is going wrong, but I have no idea where to dig. Maybe you've encountered something like this?

[two training-graph screenshots]

NikitaKononov commented on August 15, 2024

> Delete the tensors every epoch.

My WGAN-GP for Wav2Lip completely fails to train; I'm searching for the reason. By the way, did your implementation of Wav2Lip GAN with the Wasserstein loss and gradient penalty perform better than the original implementation?

If yes, how much quality did it add to the result? Is that implementation worthwhile?

commented on August 15, 2024

I cannot share my work here because of my commitments. Anyway, WGAN-GP can improve your result; it mainly improves image quality. You need to find a way to schedule the weight of SyncNet. I can't share all of my work. Thanks.
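
The author doesn't describe their schedule, but a minimal sketch of the idea, with made-up thresholds, might be:

```python
def syncnet_weight(step: int, warmup_steps: int = 200_000,
                   final_wt: float = 0.03) -> float:
    # Keep SyncNet out of the loss while the generator learns image quality,
    # then ramp its weight in linearly. All numbers are placeholders.
    if step < warmup_steps:
        return 0.0
    ramp = min(1.0, (step - warmup_steps) / warmup_steps)
    return final_wt * ramp
```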

NikitaKononov commented on August 15, 2024

> I cannot share my work here because of my commitments. Anyway, WGAN-GP can improve your result; it mainly improves image quality. You need to find a way to schedule the weight of SyncNet. I can't share all of my work. Thanks.

I understand you, but I didn't ask you to share your full work or code. My training fails, and not because of SyncNet; I don't know where to dig.
Can you tell me what learning rates you used for D and G, if that's not a commercial secret? And what critic interval? Maybe my problem is there.
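
For reference, the WGAN-GP paper's defaults (Gulrajani et al., 2017) are a common starting point; these are the paper's values, not necessarily this author's, and generator/critic are the models from the earlier sketches:

```python
import torch

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.0, 0.9))
d_opt = torch.optim.Adam(critic.parameters(),    lr=1e-4, betas=(0.0, 0.9))
critic_iters = 5    # critic updates per generator update (the "critic interval")
gp_weight = 10.0    # lambda for the gradient penalty
```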

aishoot commented on August 15, 2024

> > Delete the tensors every epoch.
>
> My WGAN-GP for Wav2Lip completely fails to train; I'm searching for the reason. By the way, did your implementation of Wav2Lip GAN with the Wasserstein loss and gradient penalty perform better than the original implementation?
>
> If yes, how much quality did it add to the result? Is that implementation worthwhile?

My training failed too, using wloss_hq_wav2lip_train.py. All losses were 0.

commented on August 15, 2024

Follow my dataset processing.
