jeremycchsu / vae-npvc

Re-implementation of the code used in Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder

License: Other

Python 98.80% Shell 1.20%

vae-npvc's Issues

loss objective function missed -

https://github.com/JeremyCCHsu/vae-npvc/blob/master/model/vae.py#L128
According to the paper, shouldn't it be loss['G'] = logPx - D_KL?
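For reference, the textbook form of the VAE objective (the evidence lower bound) is sketched below. The notation is generic and assumed here; it may not match the paper's exact symbols, and any conditioning on speaker identity is omitted.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
\;-\; D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)
```

The right-hand side is a quantity to maximize. If an optimizer minimizes its loss, the equivalent loss is the negated bound, that is, D_KL - logPx, so a sign difference between a paper's objective and a code's loss is not necessarily a bug.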

Issue with NHWC format

As you said, my TensorFlow is running on CPU, so I changed the format to 'NHWC', but I am still getting the same error: Conv2DCustomBackpropInputOp only supports NHWC.
Could you tell me which file I need to change?

Results on other datasets

The results on the VCTK dataset are good. However, when I use it on my own dataset, the reconstructions are very noisy and inaudible. Any suggestions?

How to convert NCHW ops to NHWC ops?

Hi @JeremyCCHsu

Could you guide me on this conversion? I have no idea which ops are NCHW and which are NHWC.

Thank you.
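As a rough illustration of what the two layouts mean, the sketch below converts an array between NCHW (batch, channels, height, width) and NHWC (batch, height, width, channels) with NumPy. The helper names and the example shape are assumptions for illustration and are not taken from the repository. In TensorFlow 1.x the layout of a convolution is selected by the `data_format` argument (for example `'NHWC'` or `'NCHW'` in `tf.nn.conv2d`), so any op that receives that argument, and any tensor fed to it, is a place where the layout matters.

```python
import numpy as np


def nchw_to_nhwc(x):
    """Move the channel axis from position 1 to the last position."""
    return np.transpose(x, (0, 2, 3, 1))


def nhwc_to_nchw(x):
    """Move the channel axis from the last position back to position 1."""
    return np.transpose(x, (0, 3, 1, 2))


# Hypothetical batch: 16 frames, 1 channel, 513 frequency bins, width 1.
batch_nchw = np.zeros((16, 1, 513, 1), dtype=np.float32)
batch_nhwc = nchw_to_nhwc(batch_nchw)
print(batch_nhwc.shape)
```

Changing only the `data_format` string without also permuting the input tensor changes which axis is treated as channels, which in turn changes the shape of the convolution kernels.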

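Hello! I have a question

I want to train with Chinese data. Do I need to change anything? Or can I simply swap in the dataset and leave the rest of the code unchanged?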

How to change the model from VAE to VAWGAN?

Hi,

I am trying to do non-parallel voice conversion. With the VAE option I am able to do voice conversion, but when I change the model to VAWGAN, it gives the error below.

python main.py --model VAWGAN --trainer VAWGANTrainer --architecture architecture-vawgan-vcc2016.json

FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "main.py", line 40, in <module>
MODEL = getattr(module, args.model)
AttributeError: module 'model.vae' has no attribute 'VAWGAN'
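The traceback shows the model class being looked up by name with `getattr` on a module, which fails when that module does not define the requested class. The sketch below reproduces the mechanism with a stand-in module and reports which classes are actually available. The module object, the `ConvVAE` stub, and the `resolve_model` helper are hypothetical and are not the repository's code.

```python
import types

# Hypothetical stand-in for a module such as model.vae that defines one class.
vae_module = types.ModuleType("model.vae")


class ConvVAE:
    """Placeholder class; the real model is not reproduced here."""


vae_module.ConvVAE = ConvVAE


def resolve_model(module, name):
    """Look up a class by name and list the alternatives if it is missing."""
    try:
        return getattr(module, name)
    except AttributeError:
        available = sorted(
            key for key, value in vars(module).items() if isinstance(value, type)
        )
        raise ValueError(
            "module %r has no class %r; available: %s"
            % (module.__name__, name, available)
        ) from None


print(resolve_model(vae_module, "ConvVAE").__name__)
```

With this kind of lookup, a name passed on the command line only resolves if the module being searched defines a class with exactly that name.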

How to generate *.npf files

root@vultr:~/vae-npvc# python3 main.py --model ConvVAE --trainer VAETrainer --architecture architecture-vae-vcc2016.json
Using default logdir: logdir/train/0114-1444-11-2019
Traceback (most recent call last):
  File "main.py", line 78, in <module>
main()
File "main.py", line 58, in main
xmax=np.fromfile('./etc/xmax.npf'),
FileNotFoundError: [Errno 2] No such file or directory: './etc/xmax.npf'

The interpreter reports that the file cannot be found. Could you explain how to generate it? I am a beginner and do not understand.
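`np.fromfile` reads a headerless binary dump, so a file like `xmax.npf` has to be written beforehand, typically by a preprocessing step that computes statistics over the extracted features. The sketch below shows one way such a file could be produced and read back. The helper name, the `xmin.npf` companion file, the per-bin max/min statistics, and the float64 dtype are all assumptions; the repository's own preprocessing script is not reproduced here.

```python
import os
import tempfile

import numpy as np


def save_feature_stats(features, out_dir):
    """Write per-bin max and min of a (frames, bins) array as raw float64."""
    xmax = features.max(axis=0).astype(np.float64)
    xmin = features.min(axis=0).astype(np.float64)
    xmax.tofile(os.path.join(out_dir, "xmax.npf"))
    xmin.tofile(os.path.join(out_dir, "xmin.npf"))
    return xmax, xmin


# Hypothetical features: 100 frames with 513 bins each.
features = np.random.RandomState(0).rand(100, 513)
out_dir = tempfile.mkdtemp()
xmax, xmin = save_feature_stats(features, out_dir)

# The call in the traceback passes no dtype, so NumPy's default (float64) applies.
loaded = np.fromfile(os.path.join(out_dir, "xmax.npf"))
print(loaded.shape)
```

Because the file carries no header, the dtype used when writing must match the dtype used when reading; otherwise the values are silently misinterpreted.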

issue while training model

I am working on non-parallel voice conversion using the VAWGAN model. While training the model I ran into the above error. Can you help me fix this issue?
Is it caused by tf.make_template() or by the template.py file?

Is there a pre-trained model here?

the logPx log_var is zero

I am curious about https://github.com/JeremyCCHsu/vae-npvc/blob/master/model/vae.py#L124
logPx = tf.reduce_mean(
    GaussianLogDensity(
        slim.flatten(x),
        slim.flatten(xh),
        tf.zeros_like(slim.flatten(xh))),
If log_var is a constant 0, is the GaussianLogDensity loss then equivalent to MSE?
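The relationship asked about can be checked numerically. The sketch below uses a NumPy function written for this illustration (it is not the repository's `GaussianLogDensity`, whose reduction may differ) and assumes a diagonal Gaussian whose log density is summed over the feature axis. With `log_var = 0`, the mean log density equals the MSE scaled by `-d/2` plus a constant, so maximizing it is equivalent to minimizing the MSE.

```python
import numpy as np


def gaussian_log_density(x, mu, log_var):
    """Log N(x; mu, diag(exp(log_var))) per sample, summed over the last axis."""
    c = np.log(2.0 * np.pi)
    var = np.exp(log_var)
    return -0.5 * np.sum(c + log_var + (x - mu) ** 2 / var, axis=-1)


rng = np.random.RandomState(0)
x = rng.randn(16, 513)   # hypothetical targets
xh = rng.randn(16, 513)  # hypothetical reconstructions

log_px = np.mean(gaussian_log_density(x, xh, np.zeros_like(xh)))

d = x.shape[1]
mse = np.mean((x - xh) ** 2)
# With unit variance: log_px = -0.5 * d * (log(2*pi) + mse)
expected = -0.5 * d * (np.log(2.0 * np.pi) + mse)
print(np.isclose(log_px, expected))
```

The gradients of the two objectives therefore differ only by a constant factor, which affects the effective weighting against the KL term but not the minimizer of the reconstruction term on its own.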

About reading data

In "main.py", the training data are obtained once via "image, label = read(***)". However, the function "read()" in analyzer.py appears to return only one batch of samples, and I see no loop that fetches further samples. It therefore seems that only one batch of samples is used to train the network. Do I understand your code correctly? If so, is this the right way to read training data?

Problem with convert.py

Hi Jeremy, I am running convert.py and get this error:
InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [7,1,1,16] rhs shape= [7,1,513,16]
[[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@Encoder/Conv2d-0/Conv2d-0/kernel"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](Encoder/Conv2d-0/Conv2d-0/kernel, save/RestoreV2_1)]]

I am using a CPU, so I changed every NCHW to NHWC to support the CPU version.

Dimensions aren't equal

I get the error "ValueError: Dimensions must be equal, but are 513 and 41553 for 'loss/GaussianLogDensity/sub' (op: 'Sub') with input shapes: [16,513], [16,41553]." when I run main.py with ConvVAE and VAETrainer. My environment is Python 3.6 and TensorFlow 1.2 on CPU. Could you help me fix the problem? Thank you.

Dataset download link does not seem to work

Here is the message I got.

--2017-12-14 11:32:50-- http://datashare.is.ed.ac.uk/download/10283/2042/SUPERSEDED_-_The_Voice_Conversion_Challenge_2016.zip
Resolving datashare.is.ed.ac.uk (datashare.is.ed.ac.uk)... 129.215.41.53
Connecting to datashare.is.ed.ac.uk (datashare.is.ed.ac.uk)|129.215.41.53|:80... connected.
HTTP request sent, awaiting response... 302 Found : Moved Temporarily
Location: https://datashare.is.ed.ac.uk/download/10283/2042/SUPERSEDED_-_The_Voice_Conversion_Challenge_2016.zip [following]
--2017-12-14 11:32:52-- https://datashare.is.ed.ac.uk/download/10283/2042/SUPERSEDED_-_The_Voice_Conversion_Challenge_2016.zip
Connecting to datashare.is.ed.ac.uk (datashare.is.ed.ac.uk)|129.215.41.53|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2017-12-14 11:32:52 ERROR 404: Not Found.

unzip: cannot find or open SUPERSEDED_-_The_Voice_Conversion_Challenge_2016.zip, SUPERSEDED_-_The_Voice_Conversion_Challenge_2016.zip.zip or SUPERSEDED_-_The_Voice_Conversion_Challenge_2016.zip.ZIP.
unzip: cannot find or open vcc2016_training.zip, vcc2016_training.zip.zip or vcc2016_training.zip.ZIP.
mv: cannot stat ‘vcc2016_training’: No such file or directory
unzip: cannot find or open evaluation_all.zip, evaluation_all.zip.zip or evaluation_all.zip.ZIP.
rm: cannot remove ‘evaluation_all.zip’: No such file or directory
rm: cannot remove ‘vcc2016_training.zip’: No such file or directory

Can it support Chinese? What can I do?

tensorflow.python.framework.errors_impl.InvalidArgumentError: Conv2DCustomBackpropInputOp only supports NHWC.

tensorflow.python.framework.errors_impl.InvalidArgumentError: Conv2DCustomBackpropInputOp only supports NHWC.
[[{{node Update/gradients/loss/Encoder/Conv2d-4/Conv2d-4/Conv2D_grad/Conv2DBackpropInput}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "F:/work/vae-npvc-master/main.py", line 78, in <module>
main()
File "F:/work/vae-npvc-master/main.py", line 74, in main
trainer.train(nIter=arch['training']['max_iter'], machine=machine)
File "F:\work\vae-npvc-master\trainer\vae.py", line 99, in train
sess.run(self.opt['g'])
File "F:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
run_metadata_ptr)
File "F:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1173, in _run
feed_dict_tensor, options, run_metadata)
File "F:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_run
run_metadata)

Where can I listen to a demo?

As the title says, where can I listen to the demo?

May I ask why you removed the W-GAN model?

I see there was a WGAN model in an earlier commit, but it has since been removed. How can I reproduce your results?