hed-dlg

Hierarchical Encoder Decoder for Dialog Modelling

Creating A Dataset

The script convert-text2dict.py can be used to generate model datasets based on text files with dialogues. It is assumed that each dialogue consists of three turns: A-B-A.

Prepare your dataset as a text file with one dialogue (one triple) per line. There must be exactly three utterances in each dialogue, and they must be separated by the tab symbol. There must not be any tab symbols elsewhere in the file. The dialogues are assumed to be tokenized. If you have validation and test sets, they must satisfy the same requirements.
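The format requirement above can be checked before running the conversion script. The following is a minimal sketch, not part of the repository; the function name check_triples is hypothetical. It only checks that each line splits into exactly three tab-separated utterances.

```python
def check_triples(lines):
    """Return (line_number, utterance_count) for every line that is not a triple."""
    problems = []
    for number, line in enumerate(lines, start=1):
        # Strip only the trailing newline so that tabs are left untouched.
        utterances = line.rstrip("\n").split("\t")
        if len(utterances) != 3:
            problems.append((number, len(utterances)))
    return problems


# Hypothetical sample data: the second line has only two utterances.
sample = [
    "how are you ?\ti am fine .\tgood to hear .\n",
    "hello\thi there\n",
]
print(check_triples(sample))  # [(2, 2)]
```

To check a real file, pass an open file object instead of the list, for example check_triples(open("train.txt", encoding="utf-8")), where train.txt stands for your own dataset file.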

Once you're ready, you can create the model dataset files by running:

python convert-text2dict.py <training_file> --cutoff <vocabulary_size> Training
python convert-text2dict.py <validation_file> --dict=Training.dict.pkl Validation
python convert-text2dict.py <test_file> --dict=Training.dict.pkl <vocabulary_size> Test

where <training_file>, <validation_file> and <test_file> are your dataset text files, and <vocabulary_size> is the number of distinct tokens that you want to train on (all other tokens will be converted to out-of-vocabulary symbols).
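To illustrate what a vocabulary cutoff does, here is a small sketch. It is not the code of convert-text2dict.py. It assumes that the kept tokens are the most frequent ones and that the out-of-vocabulary symbol is written "<unk>"; the function names are hypothetical. The real script may also reserve special tokens, which this sketch ignores.

```python
from collections import Counter

UNK = "<unk>"  # assumed spelling of the out-of-vocabulary symbol


def build_vocabulary(dialogues, cutoff):
    """Keep the `cutoff` most frequent tokens across all dialogues."""
    counts = Counter(
        token
        for dialogue in dialogues
        for utterance in dialogue.split("\t")
        for token in utterance.split()
    )
    return {token for token, _ in counts.most_common(cutoff)}


def replace_rare(dialogue, vocabulary):
    """Replace every token outside the vocabulary with UNK, keeping the tabs."""
    return "\t".join(
        " ".join(tok if tok in vocabulary else UNK for tok in utterance.split())
        for utterance in dialogue.split("\t")
    )


dialogues = ["a a b\ta c\ta b"]  # toy triple: a occurs 4 times, b twice, c once
vocabulary = build_vocabulary(dialogues, cutoff=2)
print(replace_rare(dialogues[0], vocabulary))  # a a b<TAB>a <unk><TAB>a b
```

The validation and test sets reuse the training dictionary (Training.dict.pkl in the commands above), so tokens that appear only in those sets are also mapped to the out-of-vocabulary symbol.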

Training The Model

If you have the bleeding-edge version of Theano installed with GPU support, you can train the model as follows:

  1. Clone the GitHub repository.
  2. Create new "Output" and "Data" directories inside it.
  3. Unpack your dataset files into the "Data" directory.
  4. Create a new prototype inside state.py (look at prototype_moviedic or prototype_test as examples).
  5. From the terminal, cd into the code directory and run:

THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python train.py --prototype <prototype_name> &> Model_Output.txt

For a dataset of about 13M words, such as MovieTriples, training takes about 1-2 days to converge.

To test the model afterwards, you can run:

THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python evaluate.py --exclude-sos --plot-graphs Output/<model_name> --document_ids Data/Test_Shuffled_Dataset_Labels.txt &> Model_Evaluation.txt

where <model_name> is the name automatically generated by train.py.

If your GPU runs out of memory, you can reduce the bs (batch size) parameter inside state.py, although training will then be slower. You can also experiment with the other parameters inside state.py.
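As a rough sketch of such an adjustment, a prototype can start from a base configuration and override only the batch size. Only the bs parameter comes from the text above. The function prototype_state below is a stand-in written for this example, its default value is a placeholder, and the name prototype_small_batch is hypothetical; look at prototype_moviedic or prototype_test in state.py for the real structure.

```python
def prototype_state():
    # Stand-in for a base configuration; 80 is a placeholder, not the real default.
    return {"bs": 80}


def prototype_small_batch():
    # Hypothetical prototype: start from the base state and halve the batch size
    # so that each training step needs less GPU memory.
    state = prototype_state()
    state["bs"] = state["bs"] // 2
    return state


print(prototype_small_batch()["bs"])  # 40
```

A prototype defined this way would then be selected by name with the --prototype option of train.py.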

Contributors

julianser