
controlled-text-generation's Introduction

Controlled Text Generation

Reproducing Hu et al.'s "Toward Controlled Generation of Text" (ICML 2017) in PyTorch. This work was done for the University of Bonn's NLP Lab project in the Winter Semester 2017/2018.

Requirements

  1. Python 3.5+
  2. PyTorch 0.3
  3. TorchText https://github.com/pytorch/text

How to run

  1. Run python train_vae.py --save {--gpu}. This will create vae.bin. This is essentially the base VAE of Bowman et al., 2015 [2].
  2. Run python train_discriminator.py --save {--gpu}. This will create ctextgen.bin. The discriminator uses the architecture of Kim, 2014 [3], and the training procedure follows Hu et al., 2017 [1].
  3. Run python test.py --model {vae, ctextgen}.bin {--gpu} for basic evaluations, e.g. conditional generation and latent interpolation.

Difference compared to the paper

  1. The model is conditioned only on sentiment, i.e. there is no tense conditioning.
  2. Training uses the SST dataset exclusively, which has only ~2,800 sentences after filtering. This might not be enough data and can lead to overfitting; in the original work by Hu et al., 2017 [1], the base VAE is first trained on a larger dataset.
  3. Most of the hyperparameter values are different from the paper's.

References

  1. Hu, Zhiting, et al. "Toward Controlled Generation of Text." International Conference on Machine Learning, 2017.
  2. Bowman, Samuel R., et al. "Generating Sentences from a Continuous Space." arXiv preprint arXiv:1511.06349, 2015.
  3. Kim, Yoon. "Convolutional Neural Networks for Sentence Classification." arXiv preprint arXiv:1408.5882, 2014.

controlled-text-generation's People

Contributors

tobiaslee, wiseodd, yonathansantosa


controlled-text-generation's Issues

enc_input

This line seems to assume that the text between sos and eos is always long enough such that there is no padding at the end of the sentence. Why?
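
For a quick illustration of what relaxing that assumption could look like, here is a minimal sketch that derives each sentence's true length from the pad token instead of relying on a fixed window; pad_idx and the (seq_len, batch) layout of inputs are assumptions for the sketch, not code taken from the repository.

    import torch

    def sentence_lengths(inputs, pad_idx):
        # inputs: (seq_len, batch) LongTensor of token ids (assumed layout).
        # Count non-pad tokens per column; sentences shorter than the fixed
        # length carry trailing <pad> tokens that a plain inputs[1:-1] slice
        # would otherwise treat as real words.
        return (inputs != pad_idx).sum(dim=0)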

KL divergence

Can you give some explanation of the KL divergence term? I am a little bit confused by this line:

    kl_loss = torch.mean(0.5 * torch.sum(torch.exp(logvar) + mu**2 - 1 - logvar, 1))

Thank you so much!
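
For context: that line is the closed-form KL divergence between the approximate posterior q(z|x) = N(mu, diag(exp(logvar))) and the standard normal prior N(0, I), which equals 0.5 * sum(exp(logvar) + mu^2 - 1 - logvar) per example, averaged over the batch. A hedged sketch that checks the closed form against torch.distributions (this check needs PyTorch >= 0.4, newer than the version in the requirements):

    import torch
    from torch.distributions import Normal, kl_divergence

    mu, logvar = torch.randn(4, 8), torch.randn(4, 8)
    std = torch.exp(0.5 * logvar)

    # Closed form used in the repository's loss.
    closed_form = torch.mean(
        0.5 * torch.sum(torch.exp(logvar) + mu**2 - 1 - logvar, 1))

    # Reference value from torch.distributions.
    q = Normal(mu, std)
    p = Normal(torch.zeros_like(mu), torch.ones_like(std))
    reference = kl_divergence(q, p).sum(1).mean()

    print(closed_form.item(), reference.item())  # the two values agree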

Optimizing generate_soft_embed() and sample_soft_embed()

I'm wondering how to optimize the functions generate_soft_embed() and sample_soft_embed() in model.py. Right now, we loop over the mini-batch sequentially, which is extremely slow. On the other hand, there doesn't seem to be an intuitive way to parallelize this. Any ideas?
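
One direction, sketched under assumptions about the model's interfaces (decoder is assumed to return vocabulary logits of shape (1, batch, vocab), emb is an nn.Embedding, start_idx is the <start> token id): the recurrence over time steps has to stay sequential, but the loop over the mini-batch does not, since each step can be run for the whole batch at once and the soft embedding is just a batched matrix product with the embedding table.

    import torch
    import torch.nn.functional as F

    def generate_soft_embed_batched(decoder, emb, h, start_idx, seq_len, temp=1.0):
        # h: (1, batch, hidden) initial decoder state for the whole batch.
        batch = h.size(1)
        # Start every sequence from the <start> token's embedding.
        x = emb.weight[start_idx].expand(batch, -1).unsqueeze(0)
        outputs = []
        for _ in range(seq_len):               # time steps stay sequential...
            logits, h = decoder(x, h)          # ...but the whole batch moves together
            probs = F.softmax(logits.squeeze(0) / temp, dim=1)    # (batch, vocab)
            x = (probs @ emb.weight).unsqueeze(0)                 # soft embedding
            outputs.append(x)
        return torch.cat(outputs, dim=0)       # (seq_len, batch, emb_dim)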

AttributeError: module 'msgpack._unpacker' has no attribute 'unpack'

I tried to run train_vae.py

Traceback (most recent call last):
  File "train_vae.py", line 40, in <module>
    dataset = SST_Dataset()
  File "/n/w1-bjayakumar/Others_Models/controlled-text-generation/ctextgen/dataset.py", line 8, in __init__
    self.TEXT = data.Field(init_token='<start>', eos_token='<eos>', lower=True, tokenize='spacy', fix_length=16)
  File "./.conda/envs/py36/lib/python3.6/site-packages/torchtext/data/field.py", line 150, in __init__
    self.tokenize = get_tokenizer(tokenize)
  File "./.conda/envs/py36/lib/python3.6/site-packages/torchtext/data/utils.py", line 12, in get_tokenizer
    spacy_en = spacy.load('en')
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/__init__.py", line 15, in load
    return util.load_model(name, **overrides)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/util.py", line 112, in load_model
    return load_model_from_link(name, **overrides)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/util.py", line 129, in load_model_from_link
    return cls.load(**overrides)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/data/en/__init__.py", line 12, in load
    return load_model_from_init_py(__file__, **overrides)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/util.py", line 173, in load_model_from_init_py
    return load_model_from_path(data_path, meta, **overrides)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/util.py", line 156, in load_model_from_path
    return nlp.from_disk(model_path)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/language.py", line 653, in from_disk
    util.from_disk(path, deserializers, exclude)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/util.py", line 511, in from_disk
    reader(path / key)
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/language.py", line 641, in <lambda>
    self.vocab.from_disk(p) and _fix_pretrained_vectors_name(self))),
  File "vocab.pyx", line 380, in spacy.vocab.Vocab.from_disk
  File "vectors.pyx", line 391, in spacy.vectors.Vectors.from_disk
  File "./.conda/envs/py36/lib/python3.6/site-packages/spacy/util.py", line 511, in from_disk
    reader(path / key)
  File "vectors.pyx", line 369, in spacy.vectors.Vectors.from_disk.load_key2row
  File "vectors.pyx", line 370, in spacy.vectors.Vectors.from_disk.load_key2row
  File "./.conda/envs/py36/lib/python3.6/site-packages/msgpack_numpy.py", line 179, in unpack
    return _unpacker.unpack(stream, encoding=encoding, **kwargs)
AttributeError: module 'msgpack._unpacker' has no attribute 'unpack'

Create a new iterator every time?

It seems that lines like this create a new iterator every time the function is called. If so, this will not correctly traverse the epoch and will also be very slow when the corpus is large. I feel the right way to do it might be to create a single iterator self.train_iterator = iter(self.train_iter) at initialization time, and only call next(self.train_iterator) in next_batch.
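
A minimal, self-contained sketch of the pattern suggested above (the class and attribute names here are illustrative, not the repository's):

    class BatchProvider:
        """Build the iterator once and only call next() afterwards,
        so the data is traversed instead of restarting on every call."""

        def __init__(self, train_iter):
            self.train_iter = train_iter
            self.train_iterator = iter(self.train_iter)   # created once

        def next_batch(self):
            try:
                return next(self.train_iterator)
            except StopIteration:
                # One pass finished: start a fresh one.
                self.train_iterator = iter(self.train_iter)
                return next(self.train_iterator)

    batches = BatchProvider([1, 2, 3])
    print([batches.next_batch() for _ in range(5)])   # [1, 2, 3, 1, 2]

(With torchtext's repeating iterators the StopIteration branch may never trigger; it is included only to keep the sketch self-contained.)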

How to handle inputs with variable length

Hi @wiseodd, thanks for your open-source code. I find that the forward_encoder_embed function in the model.py module cannot handle inputs with variable length. In your code, it seems that you assume each input sentence has length max_sent_len (15), since you always use the last hidden state to encode the padded sentence.
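
For reference, a common way to let an RNN encoder handle padded, variable-length batches is torch.nn.utils.rnn.pack_padded_sequence, which makes the final hidden state correspond to each sentence's last real token rather than to padding. The sketch below is a generic illustration with made-up shapes (and assumes a recent PyTorch), not the repository's forward_encoder_embed:

    import torch
    import torch.nn as nn
    from torch.nn.utils.rnn import pack_padded_sequence

    emb_dim, hidden_dim = 16, 32
    encoder = nn.GRU(emb_dim, hidden_dim)

    # embeddings: (seq_len, batch, emb_dim); lengths sorted in decreasing order.
    embeddings = torch.randn(15, 4, emb_dim)
    lengths = [15, 12, 9, 5]

    packed = pack_padded_sequence(embeddings, lengths)
    _, h_n = encoder(packed)   # h_n: (1, batch, hidden_dim), taken at each true end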

Running train_discriminator.py raises "cudnn RNN backward can only be called in training mode"

Traceback (most recent call last):
File "train_discriminator.py", line 172, in
main()
File "train_discriminator.py", line 140, in main
loss_G.backward()
File "/remote-home/yrchen/anaconda3/envs/py37_cuda8/lib/python3.7/site-packages/torch/tensor.py", line 102, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/remote-home/yrchen/anaconda3/envs/py37_cuda8/lib/python3.7/site-packages/torch/autograd/init.py", line 90, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: cudnn RNN backward can only be called in training mode
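
This error typically means backward() was called through an RNN module that was left in eval() mode; cuDNN only implements the RNN backward pass in training mode. A minimal reproduction and workaround sketch (generic, needs a GPU, and not a verified fix for this script):

    import torch
    import torch.nn as nn

    rnn = nn.GRU(4, 8).cuda()
    x = torch.randn(5, 2, 4, device='cuda', requires_grad=True)

    rnn.eval()
    # out, _ = rnn(x); out.sum().backward()   # raises the RuntimeError above

    rnn.train()                  # in training mode the cuDNN backward is allowed
    out, _ = rnn(x)
    out.sum().backward()
    # torch.backends.cudnn.enabled = False   # alternative: disable cuDNN (slower)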

Bug in optimizer usage

In train_discriminator.py, lines 78-79:

trainer_G = optim.Adam(model.encoder_params, lr=lr)
trainer_E = optim.Adam(model.decoder_params, lr=lr)

Shouldn't the names be the other way around? I.e., trainer_E should refer to the encoder's params and trainer_G to the decoder's params. I guess I just find it a little unintuitive, that's all.
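
If the naming is indeed just swapped, the intent presumably looks like the following sketch (lr and the parameter groups are the script's own; nothing else about the training step is asserted here):

    # trainer_E updates the encoder, trainer_G updates the decoder/generator.
    trainer_E = optim.Adam(model.encoder_params, lr=lr)
    trainer_G = optim.Adam(model.decoder_params, lr=lr)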

Have you made the model work yet?

Hi,

I have seen two repos working on reproducing the results of this paper, but neither is finished. It seems difficult, and some details are not given in the paper.

Can I ask about your progress?

Many thanks.

AttributeError: module 'torch' has no attribute 'argmax'

I tried to run python train_discriminator.py

But
/home/lab303/anaconda2/envs/py36/bin/python /home/lab303/controlled-text-generation/train_discriminator.py
0%| | 0/5000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/lab303/controlled-text-generation/train_discriminator.py", line 164, in
main()
File "/home/lab303/controlled-text-generation/train_discriminator.py", line 88, in main
target_c = torch.argmax(c_gen, dim=1)
AttributeError: module 'torch' has no attribute 'argmax'

Requirements

  1. Python 3.5+
  2. PyTorch 0.3
  3. TorchText https://github.com/pytorch/text

But

PyTorch 0.3 has no attribute 'argmax'

Help me, please.
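
torch.argmax was added in PyTorch 0.4, so on PyTorch 0.3 it is indeed missing. The usual equivalent is to take the indices returned by torch.max, so a hedged drop-in replacement for that line would be:

    # torch.max returns (values, indices); the indices along dim=1 are the argmax.
    _, target_c = torch.max(c_gen, dim=1)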

Is the gradient for Encoder doubled?

I notice there are two backpropagations for the generator and encoder.

https://github.com/wiseodd/controlled-text-generation/blob/master/train_discriminator.py#L120-L122
https://github.com/wiseodd/controlled-text-generation/blob/master/train_discriminator.py#L130-L132

After the backpropagation of loss_G, the script runs zero_grad to clear all the gradients of the generator in the auto-encoder. However, the encoder is also in the forward path, and its gradients are preserved. Then the encoder loss is computed and backpropagated again, so the gradient of the VAE loss is accumulated twice and the final value is doubled. Is my understanding correct?
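
If that reading of the code is right, a hedged fix would be to clear the encoder's gradients as well before its own backward pass, so only the encoder loss contributes to its update (the names below mirror the optimizer snippets quoted earlier and are otherwise assumptions):

    trainer_E.zero_grad()   # drop encoder gradients left over from loss_G.backward()
    loss_E.backward()
    trainer_E.step()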

Input labels in train_vae.py

Line 59 in train_vae.py:
inputs, labels = dataset.next_batch(args.gpu)

Here, the inputs are the sentences and labels are sentiments. However, this is just a VAE, so shouldn't the labels be the same as the inputs?
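
For what it's worth, in the plain VAE stage the reconstruction targets do come from the inputs themselves; the sentiment labels returned by next_batch are simply not needed until the discriminator stage. A hedged sketch of that reading (the shift-by-one target construction is a common convention, not necessarily the repository's exact code):

    inputs, _ = dataset.next_batch(args.gpu)   # labels unused for the plain VAE
    dec_targets = inputs[1:, :]                # the decoder reconstructs the input tokens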

weird error when training on gpu

Traceback (most recent call last):
File "train_discriminator.py", line 308, in
main()
File "train_discriminator.py", line 239, in main
loss_G.backward()
File "/home-nfs/yangc1/anaconda3/envs/speech/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home-nfs/yangc1/anaconda3/envs/speech/lib/python3.6/site-packages/torch/autograd/init.py", line 89, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: backward_input can only be called in training mode

Hi, this code works fine on CPU. However, a weird error occurs during GPU training. I have checked that the model is in training mode as well.
