Dialogue Graph Modeling for Conversational Machine Reading

This is the code for the paper Dialogue Graph Modeling for Conversational Machine Reading.

Here is a CodaLab bundle link to reproduce our results.

1. Requirements

(Our experiment environment for reference)

  • Python 3.6

  • Python 2.7 (for the open discourse tagging tool)

  • PyTorch (1.6.0)

  • NLTK (3.4.5)

  • spaCy (2.0.16)

  • transformers (2.8.0)

  • editdistance (0.5.2)

  • dgl (0.5.3)
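To compare a local environment against the versions listed above, a small check along the following lines may help. It is only a sketch: the PyPI distribution names used as keys (for example `torch` for PyTorch) and the helper names `installed_version` and `find_mismatches` are assumptions made for illustration and are not part of this repository.

```python
from typing import Callable, Dict, List, Optional

# Pinned versions from the requirements list. The keys are assumed PyPI
# distribution names (e.g. "torch" for PyTorch).
REQUIRED = {
    "torch": "1.6.0",
    "nltk": "3.4.5",
    "spacy": "2.0.16",
    "transformers": "2.8.0",
    "editdistance": "0.5.2",
    "dgl": "0.5.3",
}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        import pkg_resources  # fallback for Python 3.6 / 3.7

        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(
    required: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> List[str]:
    """List every package that is missing or differs from the pinned version."""
    problems = []
    for name, wanted in required.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed (expected {wanted})")
        # Ignore local build suffixes such as "1.6.0+cu101".
        elif found.split("+")[0] != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    for line in find_mismatches(REQUIRED):
        print(line)
```

The `lookup` argument is injectable so the comparison logic can be exercised without installing anything.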

2. Datasets

Download the dataset and extract it, or use it directly from the directory data/sharc_raw.

3. Instructions

3.1 Preprocess Data

Fixing errors in raw data

python fix_question.py

EDU segmentation

The environment requirements are listed here

cd segedu
python preprocess_discourse_segment.py
python sharc_discourse_segmentation.py

Discourse relations tagging

We need to train a discourse relation tagging model by following the instructions here. First, download the GloVe pretrained word vectors and put them at DialogueDiscourseParsing/glove/glove.6B.100d.txt.
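Before training, it can be worth confirming that the downloaded file really holds 100-dimensional vectors. The sketch below assumes the standard GloVe text layout (one word per line followed by space-separated floats); `load_glove` is a hypothetical helper written for this check, not a function from the parser code.

```python
from typing import Dict, List


def load_glove(path: str, dim: int = 100, limit: int = 1000) -> Dict[str, List[float]]:
    """Read up to `limit` vectors and verify that each has `dim` components."""
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line_no, line in enumerate(handle, start=1):
            if line_no > limit:
                break
            parts = line.rstrip("\n").split(" ")
            word, values = parts[0], parts[1:]
            if len(values) != dim:
                raise ValueError(
                    f"line {line_no}: expected {dim} values, got {len(values)}"
                )
            vectors[word] = [float(v) for v in values]
    return vectors
```

For example, `load_glove("DialogueDiscourseParsing/glove/glove.6B.100d.txt")` raises a `ValueError` if a file with a different dimensionality was placed at that path by mistake.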

Second, preprocess the data for training.

python data_pre.py <input_dir> <output_file>

Or you can directly use the data in DialogueDiscourseParsing/data/processed_data.

Then train the parser with

python main.py --train

The model should be stored in DialogueDiscourseParsing/dev_model. Alternatively, you can directly use the model trained here.

Finally, run inference on the ShARC dataset to get the discourse relations.

python construct_tree_mapping.py
python convert.py

cd DialogueDiscourseParsing
python main_.py

Preprocessing for Decision Making

python preprocess_decision_base.py

Preprocessing for Question Generation

python preprocess_span.py

All the preprocessed data can be found in the directory ./data. You can also download it here.
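When preprocessing from scratch, the order of the steps matters, so it can help to rehearse them as a dry run first. The sketch below only prints shell lines and executes nothing. It covers the first three steps of Section 3.1; the working directories are assumptions read off the `cd` lines in this README, and `render_steps` is a hypothetical helper. The remaining steps can be appended in the same form once their directories are confirmed.

```python
import shlex
from typing import List, Tuple

# (working directory, script) pairs. The directories are assumptions based on
# the `cd` lines in this README; adjust them to the actual repository layout.
STEPS: List[Tuple[str, str]] = [
    (".", "fix_question.py"),
    ("segedu", "preprocess_discourse_segment.py"),
    ("segedu", "sharc_discourse_segmentation.py"),
]


def render_steps(steps: List[Tuple[str, str]]) -> List[str]:
    """Return one subshell command per step, without executing anything."""
    return [
        f"(cd {shlex.quote(cwd)} && python {shlex.quote(script)})"
        for cwd, script in steps
    ]


if __name__ == "__main__":
    for line in render_steps(STEPS):
        print(line)
```

Each command runs in its own subshell, so a `cd` in one step does not leak into the next.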

3.2 Decision Making and Question Generation

To train the model on the decision-making subtask, run the following:

python -u train_sharc.py \
--train_batch=16 \
--gradient_accumulation_steps=2 \
--epoch=5 \
--seed=323 \
--learning_rate=5e-5 \
--loss_entail_weight=3.0 \
--dsave="out/{}" \
--model=decision_gcn \
--early_stop=dev_0a_combined \
--data=./data/ \
--data_type=decision_electra-large-discriminator \
--prefix=train_decision \
--trans_layer=2 \
--eval_every_steps=300

The trained model and corresponding results are stored in out/train_decision.
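For launching this run from a Python script, the flags above can be kept in a dictionary and turned into an argument list. This is a minimal sketch: `build_command` and `output_dir` are hypothetical helpers, and the idea that the `{}` in `--dsave` is filled with the value of `--prefix` is an assumption inferred from the output directory named above, not something confirmed from `train_sharc.py`.

```python
from typing import Dict, List

# Flags copied from the decision-making command in this README.
DECISION_FLAGS: Dict[str, str] = {
    "train_batch": "16",
    "gradient_accumulation_steps": "2",
    "epoch": "5",
    "seed": "323",
    "learning_rate": "5e-5",
    "loss_entail_weight": "3.0",
    "dsave": "out/{}",
    "model": "decision_gcn",
    "early_stop": "dev_0a_combined",
    "data": "./data/",
    "data_type": "decision_electra-large-discriminator",
    "prefix": "train_decision",
    "trans_layer": "2",
    "eval_every_steps": "300",
}


def build_command(script: str, flags: Dict[str, str]) -> List[str]:
    """Turn a flag dictionary into an argv list suitable for subprocess.run."""
    return ["python", "-u", script] + [f"--{key}={value}" for key, value in flags.items()]


def output_dir(dsave: str, prefix: str) -> str:
    # Assumption: the "{}" placeholder in --dsave is replaced by --prefix.
    return dsave.format(prefix)


if __name__ == "__main__":
    print(" ".join(build_command("train_sharc.py", DECISION_FLAGS)))
    print(output_dir(DECISION_FLAGS["dsave"], DECISION_FLAGS["prefix"]))
```

Passing the list to `subprocess.run(build_command(...), check=True)` avoids shell quoting problems with the literal `{}` in `--dsave`.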

For the question generation subtask, we first extract the under-specified span with the following command:

python -u train_sharc.py \
--train_batch=16 \
--gradient_accumulation_steps=2 \
--epoch=5 \
--seed=115 \
--learning_rate=5e-5 \
--dsave="out/{}" \
--model=span \
--early_stop=dev_0_combined \
--data=./data/ \
--data_type=span_electra-large-discriminator \
--prefix=train_span \
--eval_every_steps=100

The trained model and corresponding results are stored in out/train_span.

Then, use the predicted under-specified spans and the rule document to generate follow-up questions:

# --fpred is the directory of span predictions
python -u qg.py \
--fin=./data/sharc_raw/json/sharc_dev.json \
--fpred=./out/inference_span \
--model_recover_path=/absolute/path/to/pretrained_models/qg.bin \
--cache_path=/absolute/path/to/pretrain_models/unilm/

The final results are stored in final_res.json.

Acknowledgement

Part of the code is modified from the [Discern](https://github.com/Yifan-Gao/Discern) implementation.
