hykas's Introduction

Towards Generalizable Neuro-Symbolic Systems for Commonsense Question Answering

This repository contains the code for the paper "Towards Generalizable Neuro-Symbolic Systems for Commonsense Question Answering". The full paper is available at https://www.aclweb.org/anthology/D19-6003.

Environment

This code has been tested with Python 3.6.7, PyTorch 1.2, and Transformers 2.3.0.

Dataset

Please download the CommonsenseQA dataset from the official website.

ConceptNet Knowledge Extraction

Go to the Extraction directory, download ConceptNet from the official website, and uncompress it there. Then simply run:

python extract_english.py 
python extract4commonsenseqa.py

This generates the files in the data directory. To run extraction on datasets other than CommonsenseQA, you will need to modify the data loading and formatting code accordingly.

ATOMIC Knowledge Generation

Note: If you only need to train models on CommonsenseQA, you can skip this part.

First, clone the official COMET repo and put the files from comet-generation under scripts/interactive. Then follow the instructions in the official repo to download the data and models needed to run generation.

HyKAS Model Training

CUDA_VISIBLE_DEVICES=0 python run_csqa.py --data_dir data/ --model_type roberta-ocn-inj \
    --model_name_or_path roberta-large --task_name csqa-inj --cache_dir downloaded_models \
    --max_seq_length 80 --do_train --do_eval --evaluate_during_training \
    --per_gpu_train_batch_size 4 --gradient_accumulation_steps 8 --learning_rate 1e-5 \
    --num_train_epochs 8 --warmup_steps 150 --output_dir workspace/hykas
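As a quick sanity check on the hyperparameters in the command above, the following sketch works out how many examples contribute to each optimizer step. It assumes that run_csqa.py follows the usual convention of accumulating gradients over several mini-batches before each update; that behavior is an assumption here, not something this README states. The function name effective_batch_size is hypothetical.

```python
# Values copied from the training command above.
per_gpu_train_batch_size = 4
gradient_accumulation_steps = 8
num_gpus = 1  # CUDA_VISIBLE_DEVICES=0 exposes a single GPU


def effective_batch_size(per_gpu, accumulation_steps, gpus=1):
    """Examples per optimizer step, assuming standard gradient accumulation."""
    return per_gpu * accumulation_steps * gpus


print(effective_batch_size(per_gpu_train_batch_size,
                           gradient_accumulation_steps,
                           num_gpus))  # 4 * 8 * 1 = 32
```

If GPU memory is tight, halving the per-GPU batch size while doubling the accumulation steps keeps this product unchanged under the same assumption.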

For CommonsenseQA, the dev accuracy should reach around 79%.

Cite

@inproceedings{ma-etal-2019-towards,
    title = "Towards Generalizable Neuro-Symbolic Systems for Commonsense Question Answering",
    author = "Ma, Kaixin and Francis, Jonathan and Lu, Quanyang and Nyberg, Eric and Oltramari, Alessandro",
    booktitle = "Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-6003",
    doi = "10.18653/v1/D19-6003",
    pages = "22--32",
}

hykas's People

Contributors

mayer123

hykas's Issues

--model_name_or_path: error

Hello, when I run
CUDA_VISIBLE_DEVICES=0 python run_csqa.py --data_dir data/ --model_type roberta-ocn-inj --model_name_or_path
roberta-large --task_name csqa-inj --cache_dir downloaded_models --max_seq_length 80 --do_train --do_eval
--evaluate_during_training --per_gpu_train_batch_size 4 --gradient_accumulation_steps 8 --learning_rate 1e-5
--num_train_epochs 8 --warmup_steps 150 --output_dir workspace/hykas

I get an error: argument --model_name_or_path: expected one argument
zsh: command not found: roberta-large

How can I fix it?
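One plausible reading of the two error messages is that the shell treated each pasted line as a separate command, since an unescaped newline ends a command. The sketch below illustrates that reading in Python: it splits the pasted text the way a shell would and then joins it back into the single command that was presumably intended. It only inspects strings and does not run the training script.

```python
import shlex

pasted = """CUDA_VISIBLE_DEVICES=0 python run_csqa.py --data_dir data/ --model_type roberta-ocn-inj --model_name_or_path
roberta-large --task_name csqa-inj --cache_dir downloaded_models --max_seq_length 80 --do_train --do_eval
--evaluate_during_training --per_gpu_train_batch_size 4 --gradient_accumulation_steps 8 --learning_rate 1e-5
--num_train_epochs 8 --warmup_steps 150 --output_dir workspace/hykas"""

# A shell ends a command at every unescaped newline, so the pasted text
# is seen as four commands. The first ends with --model_name_or_path and
# no value; the second starts with "roberta-large" as if it were a program.
separate_commands = pasted.splitlines()

# Joining the lines with spaces gives one command with every flag intact.
single_command = " ".join(line.strip() for line in separate_commands)
argv = shlex.split(single_command)

# After joining, --model_name_or_path is followed by its value.
value = argv[argv.index("--model_name_or_path") + 1]
print(len(separate_commands), value)
```

In the shell itself, the equivalent fix would be to enter the command on one line or to end each continued line with a backslash.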

The training data does not have the choice_commonsense key

I ran extract4commonsenseqa.py and it generated dev_cs.jsonl, which contains all the dev instances with a choice_commonsense key in JSON format. During training, does the code expect choice_commonsense to be present in the training data as well?
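To check which generated files actually carry the key, a small helper like the one below may be useful. It is a sketch under stated assumptions: the file is JSON Lines (one JSON object per line), and because the exact record layout is not documented here, the key is searched for at any nesting depth. The function name count_records_with_key is hypothetical and not part of this repository.

```python
import json


def count_records_with_key(path, key="choice_commonsense"):
    """Return (records containing `key` at any depth, total records)."""

    def contains(obj):
        # Search dictionaries and lists recursively for the key.
        if isinstance(obj, dict):
            return key in obj or any(contains(v) for v in obj.values())
        if isinstance(obj, list):
            return any(contains(v) for v in obj)
        return False

    found = total = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            total += 1
            if contains(json.loads(line)):
                found += 1
    return found, total
```

Calling it on the dev file and on the training file (for example with a path such as data/dev_cs.jsonl, where the directory is an assumption) shows whether the two differ in how many records contain the key.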
