cnn-al-tf

Convolutional Neural Network for Discourse Relation Sense Classification

Note: This project is mostly based on https://github.com/yuhaozhang/sentence-convnet


Requirements

To visualize the results (visualize.ipynb)

Data

  • We use the Penn Discourse Treebank (PDTB) version 2.0.
    The CoNLL 2016 data is assumed to be stored in JSON format under the data/conll directory.

    cnn-al-tf
    ├── ...
    ├── word2vec
    └── data
        └── conll
            ├── pdtb-dev.json
            ├── pdtb-dev-parses.json
            ├── pdtb-train.json
            └── pdtb-train-parses.json
    
  • The word2vec directory is initially empty. Download the Google News pretrained vectors from this Google Drive link and unzip them into that directory. The result is a .bin file.
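
Before preprocessing, it may help to confirm that the inputs are where the layout above expects them. The following is a minimal sketch, not part of the repository: the helper name missing_inputs is hypothetical, the file names are copied from the tree above, and any *.bin file in word2vec is accepted because the exact file name is not stated.

```python
from pathlib import Path

# File names copied from the directory tree above.
EXPECTED_CONLL_FILES = [
    "pdtb-dev.json",
    "pdtb-dev-parses.json",
    "pdtb-train.json",
    "pdtb-train-parses.json",
]


def missing_inputs(project_root):
    """Return the expected input paths that are absent under project_root.

    Hypothetical helper for illustration; it is not provided by cnn-al-tf.
    """
    root = Path(project_root)
    missing = []
    conll_dir = root / "data" / "conll"
    for name in EXPECTED_CONLL_FILES:
        if not (conll_dir / name).is_file():
            missing.append(str(Path("data") / "conll" / name))
    # The pretrained vectors are only described as "a .bin file",
    # so any *.bin file in the word2vec directory is accepted here.
    word2vec_dir = root / "word2vec"
    if not (word2vec_dir.is_dir() and any(word2vec_dir.glob("*.bin"))):
        missing.append(str(Path("word2vec") / "*.bin"))
    return missing


if __name__ == "__main__":
    for path in missing_inputs("."):
        print("missing:", path)
```

Run from the project root, it prints one line per missing input and nothing when everything is in place.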

Usage

Preprocess

python ./util.py

This creates the vocab.txt, *.ids, and emb.npy files.

Training

  • Hierarchical multi-label classification with negative sampling (HML+NS):

    python ./train.py --sent_len=163 --vocab_size=34368 --num_classes=21 \
    --hierarchical=True --negative=True --use_pretrain=True
  • Hierarchical multi-label classification on split contexts with negative sampling (HML+NS+Split):

    python ./train_split.py --sent_len=100 --vocab_size=34368 --num_classes=21 \
    --hierarchical=True --negative=True --use_pretrain=True
  • Active learning with max-entropy strategy:

    python ./train_active.py --sent_len=163 --vocab_size=34368 --num_classes=21 \
    --hierarchical=True --negative=False --use_pretrain=True --strategy=max_entropy \
    --num_epochs=1 --batch_size=100 --pool_size=1000 --log_step=1 --summary_step=10
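
To illustrate the max-entropy strategy named in the last command, here is a minimal sketch of entropy-based selection from an unlabeled pool. It assumes each pool example comes with a normalized probability distribution over classes. The function names are hypothetical, and train_active.py may score examples differently, particularly for multi-label outputs.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_max_entropy(pool_probs, k):
    """Return indices of the k pool examples with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Illustrative probabilities only; these are not model outputs.
pool = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, high entropy
    [0.70, 0.20, 0.10],  # in between
]
print(select_max_entropy(pool, 2))  # [1, 2]
```

The examples the model is least certain about are picked first, which is the usual motivation for querying labels by maximum entropy.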

Caution: The options sent_len, vocab_size, and num_classes depend on the input data, and a wrong value may cause an error. Check these values before training the model on another dataset.
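
One way to check two of these values is to read them back from the preprocessed files. The sketch below rests on assumptions about the on-disk format that the README does not confirm: vocab.txt is taken to hold one token per line, and each *.ids file is taken to hold one example per line as whitespace-separated integer ids. Both helper names are hypothetical.

```python
def count_vocab(vocab_path):
    """Count non-empty lines, assuming one token per line (unconfirmed format)."""
    with open(vocab_path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def max_sequence_length(ids_path):
    """Length of the longest example, assuming one example per line
    with whitespace-separated ids (unconfirmed format)."""
    with open(ids_path, encoding="utf-8") as f:
        return max((len(line.split()) for line in f), default=0)
```

If the assumptions hold, count_vocab("vocab.txt") can be compared with --vocab_size, and the largest max_sequence_length over the *.ids files with --sent_len. The preprocessing may pad or truncate sequences, so treat a mismatch as a prompt to inspect the files rather than as proof of an error.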

Evaluation

  • Display F1 and AUC score (overall performance)

    python ./eval.py --checkpoint_dir=./train/1473898241
  • Display classification report (class-wise performance)

    python ./predict.py --checkpoint_dir=./train/1473898241

Replace the --checkpoint_dir value with the output directory from training.
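
When several training runs exist, the most recent one can be located programmatically. This is a minimal sketch that assumes each run is written to a subdirectory of ./train with an all-digit name, as 1473898241 in the commands above suggests. The helper latest_run_dir is hypothetical and not part of the repository.

```python
from pathlib import Path


def latest_run_dir(train_root="./train"):
    """Return the subdirectory with the largest all-digit name, or None.

    Assumes run directories are named with increasing numeric ids
    (for example Unix timestamps); this is an assumption, not a documented fact.
    """
    root = Path(train_root)
    if not root.is_dir():
        return None
    runs = [p for p in root.iterdir() if p.is_dir() and p.name.isdigit()]
    return max(runs, key=lambda p: int(p.name), default=None)


if __name__ == "__main__":
    print(latest_run_dir())
```

The printed path can then be passed as the --checkpoint_dir value to eval.py or predict.py.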

Run TensorBoard

tensorboard --logdir=./train/1473898241

Results

Model           P       R       F1      AUC
ML              0.7473  0.1360  0.2301  0.4399
ML+NS           0.7406  0.1557  0.2573  0.4370
HML             0.7722  0.1732  0.2829  0.4685
HML+NS          0.7862  0.1972  0.3153  0.4930
ML+Split        0.4932  0.0237  0.0451  0.2469
ML+NS+Split     0.4476  0.0309  0.0578  0.2156
HML+Split       0.4828  0.0486  0.0883  0.2622
HML+NS+Split    0.4732  0.0445  0.0813  0.2573
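
The F1 values in the results appear consistent with the harmonic mean of the reported precision and recall, F1 = 2PR / (P + R). The sketch below recomputes F1 from the P and R values listed above. It only cross-checks the reported numbers and does not reproduce how eval.py computes them.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) copied from the results above.
RESULTS = {
    "ML": (0.7473, 0.1360, 0.2301),
    "ML+NS": (0.7406, 0.1557, 0.2573),
    "HML": (0.7722, 0.1732, 0.2829),
    "HML+NS": (0.7862, 0.1972, 0.3153),
    "ML+Split": (0.4932, 0.0237, 0.0451),
    "ML+NS+Split": (0.4476, 0.0309, 0.0578),
    "HML+Split": (0.4828, 0.0486, 0.0883),
    "HML+NS+Split": (0.4732, 0.0445, 0.0813),
}

for name, (p, r, reported) in RESULTS.items():
    print(f"{name:14s} computed={f1_score(p, r):.4f} reported={reported:.4f}")
```

Small differences in the last digit are expected, because P and R are themselves rounded to four decimal places.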

[Figures: PR curves, AUC, F1, and loss]
