
controlled-text-generation's Introduction

Controlled Text Generation

Reproducing Hu et al.'s "Toward Controlled Generation of Text" (ICML 2017) in PyTorch. This work was done for the University of Bonn's NLP Lab project in the Winter Semester 2017/2018.

Requirements

  1. Python 3.5+
  2. PyTorch 0.3
  3. TorchText https://github.com/pytorch/text

How to run

  1. Run python train_vae.py --save {--gpu}. This will create vae.bin. This is essentially the base VAE of Bowman et al., 2015 [2].
  2. Run python train_discriminator.py --save {--gpu}. This will create ctextgen.bin. The discriminator uses the architecture of Kim, 2014 [3], and the training procedure follows Hu et al., 2017 [1].
  3. Run python test.py --model {vae, ctextgen}.bin {--gpu} for basic evaluations, e.g. conditional generation and latent interpolation.

Difference compared to the paper

  1. The model is conditioned only on sentiment, i.e. there is no tense conditioning.
  2. Training uses the SST dataset exclusively, which has only ~2,800 sentences after filtering. This might not be enough data and can lead to overfitting; in the original work by Hu et al., 2017 [1], the base VAE is first trained on a larger dataset.
  3. Most of the hyperparameter values are different from the paper's.

References

  1. Hu, Zhiting, et al. "Toward Controlled Generation of Text." International Conference on Machine Learning, 2017.
  2. Bowman, Samuel R., et al. "Generating Sentences from a Continuous Space." arXiv preprint arXiv:1511.06349, 2015.
  3. Kim, Yoon. "Convolutional Neural Networks for Sentence Classification." arXiv preprint arXiv:1408.5882, 2014.

controlled-text-generation's People

Contributors

tobiaslee, wiseodd, yonathansantosa


controlled-text-generation's Issues

enc_input

This line seems to assume that the text between sos and eos is always long enough such that there is no padding at the end of the sentence. Why?
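
For a quick illustration of what relaxing that assumption could look like, here is a minimal sketch that derives each sentence's true length from the pad token instead of relying on a fixed window; pad_idx and the (seq_len, batch) layout of inputs are assumptions for the sketch, not code taken from the repository.

    import torch

    def sentence_lengths(inputs, pad_idx):
        # inputs: (seq_len, batch) LongTensor of token ids (assumed layout).
        # Count non-pad tokens per column; sentences shorter than the fixed
        # length carry trailing <pad> tokens that a plain inputs[1:-1] slice
        # would otherwise treat as real words.
        return (inputs != pad_idx).sum(dim=0)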

KL divergence

Can you give some explanation of the KL divergence term? I am a little bit confused by this line:

    kl_loss = torch.mean(0.5 * torch.sum(torch.exp(logvar) + mu**2 - 1 - logvar, 1))

Thank you so much!
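
For context: that line is the closed-form KL divergence between the approximate posterior q(z|x) = N(mu, diag(exp(logvar))) and the standard normal prior N(0, I), which equals 0.5 * sum(exp(logvar) + mu^2 - 1 - logvar) per example, averaged over the batch. A hedged sketch that checks the closed form against torch.distributions (this check needs PyTorch >= 0.4, newer than the version in the requirements):

    import torch
    from torch.distributions import Normal, kl_divergence

    mu, logvar = torch.randn(4, 8), torch.randn(4, 8)
    std = torch.exp(0.5 * logvar)

    # Closed form used in the repository's loss.
    closed_form = torch.mean(
        0.5 * torch.sum(torch.exp(logvar) + mu**2 - 1 - logvar, 1))

    # Reference value from torch.distributions.
    q = Normal(mu, std)
    p = Normal(torch.zeros_like(mu), torch.ones_like(std))
    reference = kl_divergence(q, p).sum(1).mean()

    print(closed_form.item(), reference.item())  # the two values agree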

Optimizing generate_soft_embed() and sample_soft_embed()

I'm wondering how to optimize the functions generate_soft_embed() and sample_soft_embed() in model.py. Right now, we loop over the mini-batch sequentially, which is extremely slow. On the other hand, there doesn't seem to be an intuitive way to parallelize this. Any ideas?
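
One direction, sketched under assumptions about the model's interfaces (decoder is assumed to return vocabulary logits of shape (1, batch, vocab), emb is an nn.Embedding, start_idx is the <start> token id): the recurrence over time steps has to stay sequential, but the loop over the mini-batch does not, since each step can be run for the whole batch at once and the soft embedding is just a batched matrix product with the embedding table.

    import torch
    import torch.nn.functional as F

    def generate_soft_embed_batched(decoder, emb, h, start_idx, seq_len, temp=1.0):
        # h: (1, batch, hidden) initial decoder state for the whole batch.
        batch = h.size(1)
        # Start every sequence from the <start> token's embedding.
        x = emb.weight[start_idx].expand(batch, -1).unsqueeze(0)
        outputs = []
        for _ in range(seq_len):               # time steps stay sequential...
            logits, h = decoder(x, h)          # ...but the whole batch moves together
            probs = F.softmax(logits.squeeze(0) / temp, dim=1)    # (batch, vocab)
            x = (probs @ emb.weight).unsqueeze(0)                 # soft embedding
            outputs.append(x)
        return torch.cat(outputs, dim=0)       # (seq_len, batch, emb_dim)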

AttributeError: module 'msgpack._unpacker' has no attribute 'unpack'

I tried to run train_vae.py

Traceback (most recent call last):
  File "train_vae.py", line 40, in <module>
    dataset = SST_Dataset()
  File "/n/w1-bjayakumar/Others_Models/controlled-text-generation/ctextgen/dataset.py", line 8, in __init__
    self.TEXT = data.Field(init_token='<start>', eos_token='<eos>', lower=True, tokenize='spacy', fix_length=16)
  File "./.conda/envs/py36/lib/python3.6/site-packages/torchtext/data/field.py", line 150, in __init__
    self.tokenize = get_tokenizer(tokenize)
  File "./.conda/envs/py36/lib/python3.6/site-packages/torchtext/data/utils.py", line 12, in get_tokenizer
    spacy_en = spacy.load('en')
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/__init__.py", line 15, in load
    return util.load_model(name, **overrides)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/util.py", line 112, in load_model
    return load_model_from_link(name, **overrides)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/util.py", line 129, in load_model_from_link
    return cls.load(**overrides)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/data/en/__init__.py", line 12, in load
    return load_model_from_init_py(__file__, **overrides)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/util.py", line 173, in load_model_from_init_py
    return load_model_from_path(data_path, meta, **overrides)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/util.py", line 156, in load_model_from_path
    return nlp.from_disk(model_path)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/language.py", line 653, in from_disk
    util.from_disk(path, deserializers, exclude)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/util.py", line 511, in from_disk
    reader(path / key)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/language.py", line 641, in <lambda>
    self.vocab.from_disk(p) and _fix_pretrained_vectors_name(self))),
  File "vocab.pyx", line 380, in spacy.vocab.Vocab.from_disk
  File "vectors.pyx", line 391, in spacy.vectors.Vectors.from_disk
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/util.py", line 511, in from_disk
    reader(path / key)
  File "vectors.pyx", line 369, in spacy.vectors.Vectors.from_disk.load_key2row
  File "vectors.pyx", line 370, in spacy.vectors.Vectors.from_disk.load_key2row
  File "./.conda/envs/py36/lib/python3.6/site-packages/msgpack_numpy.py", line 179, in unpack
    return _unpacker.unpack(stream, encoding=encoding, **kwargs)
AttributeError: module 'msgpack._unpacker' has no attribute 'unpack'

Create a new iterator every time?

It seems that lines like this create a new iterator every time the function is called. If so, this will not correctly traverse the epoch and will also be very slow when the corpus is large. I feel the right way to do it might be to create a single iterator self.train_iterator = iter(self.train_iter) at initialization time, and only call next(self.train_iterator) in next_batch.
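
A minimal, self-contained sketch of the pattern suggested above (the class and attribute names here are illustrative, not the repository's):

    class BatchProvider:
        """Build the iterator once and only call next() afterwards,
        so the data is traversed instead of restarting on every call."""

        def __init__(self, train_iter):
            self.train_iter = train_iter
            self.train_iterator = iter(self.train_iter)   # created once

        def next_batch(self):
            try:
                return next(self.train_iterator)
            except StopIteration:
                # One pass finished: start a fresh one.
                self.train_iterator = iter(self.train_iter)
                return next(self.train_iterator)

    batches = BatchProvider([1, 2, 3])
    print([batches.next_batch() for _ in range(5)])   # [1, 2, 3, 1, 2]

(With torchtext's repeating iterators the StopIteration branch may never trigger; it is included only to keep the sketch self-contained.)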

How to handle inputs with variable length

Hi @wiseodd, thanks for your open-source code. I find that the forward_encoder_embed function in the model.py module cannot handle inputs with variable length. In your code, it seems that you assume each input sentence has length max_sent_len (15), since you always use the last hidden state to encode the padded sentence.
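
For reference, a common way to let an RNN encoder handle padded, variable-length batches is torch.nn.utils.rnn.pack_padded_sequence, which makes the final hidden state correspond to each sentence's last real token rather than to padding. The sketch below is a generic illustration with made-up shapes (and assumes a recent PyTorch), not the repository's forward_encoder_embed:

    import torch
    import torch.nn as nn
    from torch.nn.utils.rnn import pack_padded_sequence

    emb_dim, hidden_dim = 16, 32
    encoder = nn.GRU(emb_dim, hidden_dim)

    # embeddings: (seq_len, batch, emb_dim); lengths sorted in decreasing order.
    embeddings = torch.randn(15, 4, emb_dim)
    lengths = [15, 12, 9, 5]

    packed = pack_padded_sequence(embeddings, lengths)
    _, h_n = encoder(packed)   # h_n: (1, batch, hidden_dim), taken at each true end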

Running train_discriminator.py raises "cudnn RNN backward can only be called in training mode"

Traceback (most recent call last):
File "train_discriminator.py", line 172, in
main()
File "train_discriminator.py", line 140, in main
loss_G.backward()
File "/remote-home/yrchen/anaconda3/envs/py37_cuda8/lib/python3.7/site-packages/torch/tensor.py", line 102, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/remote-home/yrchen/anaconda3/envs/py37_cuda8/lib/python3.7/site-packages/torch/autograd/init.py", line 90, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: cudnn RNN backward can only be called in training mode
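
This error typically means backward() was called through an RNN module that was left in eval() mode; cuDNN only implements the RNN backward pass in training mode. A minimal reproduction and workaround sketch (generic, needs a GPU, and not a verified fix for this script):

    import torch
    import torch.nn as nn

    rnn = nn.GRU(4, 8).cuda()
    x = torch.randn(5, 2, 4, device='cuda', requires_grad=True)

    rnn.eval()
    # out, _ = rnn(x); out.sum().backward()   # raises the RuntimeError above

    rnn.train()                  # in training mode the cuDNN backward is allowed
    out, _ = rnn(x)
    out.sum().backward()
    # torch.backends.cudnn.enabled = False   # alternative: disable cuDNN (slower)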

Bug in optimizer usage

In train_discriminator.py, lines 78-79:

trainer_G = optim.Adam(model.encoder_params, lr=lr)
trainer_E = optim.Adam(model.decoder_params, lr=lr)

Shouldn't the names be the other way around? I.e., trainer_E should refer to the encoder's params and trainer_G to the decoder's params. I guess I just find it a little unintuitive, that's all.
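
If the naming is indeed just swapped, the intent presumably looks like the following sketch (lr and the parameter groups are the script's own; nothing else about the training step is asserted here):

    # trainer_E updates the encoder, trainer_G updates the decoder/generator.
    trainer_E = optim.Adam(model.encoder_params, lr=lr)
    trainer_G = optim.Adam(model.decoder_params, lr=lr)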

Have you made the model work yet?

Hi,

I have seen two repos working on reproducing the results of this paper, but neither is finished. It seems difficult, and some details are not given in the paper.

Can I ask about your progress?

Many thanks.

AttributeError: module 'torch' has no attribute 'argmax'

I tried to run python train_discriminator.py

But
/home/lab303/anaconda2/envs/py36/bin/python /home/lab303/controlled-text-generation/train_discriminator.py
0%| | 0/5000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/lab303/controlled-text-generation/train_discriminator.py", line 164, in
main()
File "/home/lab303/controlled-text-generation/train_discriminator.py", line 88, in main
target_c = torch.argmax(c_gen, dim=1)
AttributeError: module 'torch' has no attribute 'argmax'

Requirements

  1. Python 3.5+
  2. PyTorch 0.3
  3. TorchText https://github.com/pytorch/text

But

PyTorch 0.3 has no attribute 'argmax'

Help me, please.
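
torch.argmax was added in PyTorch 0.4, so on PyTorch 0.3 it is indeed missing. The usual equivalent is to take the indices returned by torch.max, so a hedged drop-in replacement for that line would be:

    # torch.max returns (values, indices); the indices along dim=1 are the argmax.
    _, target_c = torch.max(c_gen, dim=1)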

Is the gradient for Encoder doubled?

I notice there are two backpropagations for the generator and encoder.

https://github.com/wiseodd/controlled-text-generation/blob/master/train_discriminator.py#L120-L122
https://github.com/wiseodd/controlled-text-generation/blob/master/train_discriminator.py#L130-L132

After the backpropagation of loss_G, the script runs zero_grad to clear all the gradients of the generator in the auto-encoder. However, the encoder is also in the forward path, and its gradients are preserved. Then the encoder loss is computed and backpropagated again, so the gradient of the VAE loss is accumulated twice and the final value is doubled. Is my understanding correct?
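
If that reading of the code is right, a hedged fix would be to clear the encoder's gradients as well before its own backward pass, so only the encoder loss contributes to its update (the names below mirror the optimizer snippets quoted earlier and are otherwise assumptions):

    trainer_E.zero_grad()   # drop encoder gradients left over from loss_G.backward()
    loss_E.backward()
    trainer_E.step()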

Input labels in train_vae.py

Line 59 in train_vae.py:
inputs, labels = dataset.next_batch(args.gpu)

Here, the inputs are the sentences and labels are sentiments. However, this is just a VAE, so shouldn't the labels be the same as the inputs?
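
For what it's worth, in the plain VAE stage the reconstruction targets do come from the inputs themselves; the sentiment labels returned by next_batch are simply not needed until the discriminator stage. A hedged sketch of that reading (the shift-by-one target construction is a common convention, not necessarily the repository's exact code):

    inputs, _ = dataset.next_batch(args.gpu)   # labels unused for the plain VAE
    dec_targets = inputs[1:, :]                # the decoder reconstructs the input tokens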

weird error when training on gpu

Traceback (most recent call last):
File "train_discriminator.py", line 308, in
main()
File "train_discriminator.py", line 239, in main
loss_G.backward()
File "/home-nfs/yangc1/anaconda3/envs/speech/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home-nfs/yangc1/anaconda3/envs/speech/lib/python3.6/site-packages/torch/autograd/init.py", line 89, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: backward_input can only be called in training mode

Hi, this code works fine on CPU. However, a weird error occurs during GPU training. I have checked that the model is in training mode as well.
