Xserpy

A Python implementation of the shift-reduce semantic parser described at http://ceur-ws.org/Vol-1180/CLEF2014wn-QA-XuEt2014.pdf

Instructions for running

Run each script with the -h parameter to see a list of its required parameters. These usually consist of the path to the data files, the training/testing mode (trn/tst), the size of the dataset and the operating mode. Scripts that train a model usually also take the number of iterations. The sections below describe what each operating mode of each script does. Words in brackets in file names are variable parameters. Files not mentioned here have main methods only for testing the methods that other scripts import. All scripts require the Free917 dataset with its training and testing split.
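As a rough illustration of the command-line convention described above, here is a minimal argparse sketch. The argument names (path, mode, size, op, --iterations) are assumptions made for this example and may not match the real scripts; run the actual script with -h to see its parameters.

```python
import argparse


def build_parser():
    """Build a parser for the kinds of parameters described above.

    All argument names here are hypothetical.
    """
    parser = argparse.ArgumentParser(description="Hypothetical xserpy-style script")
    parser.add_argument("path", help="path to the data files")
    parser.add_argument("mode", choices=["trn", "tst"], help="training or testing split")
    parser.add_argument("size", type=int, help="number of questions in the split")
    parser.add_argument("op", help="operating mode letter, for example l or t")
    parser.add_argument("-i", "--iterations", type=int, default=1,
                        help="number of training iterations (training modes only)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["data/", "trn", "641", "t", "-i", "10"])
print(args.mode, args.size, args.op, args.iterations)
```

Restricting mode to trn/tst through choices makes argparse reject any other value before the script starts work.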

Annotate

annotator

  • l – annotates all words in the dataset with phrase detection labels; creates labels_(trn/tst)_(size).pickle and questions_(trn/tst)_(size).pickle files.
  • b – labels all words in the dataset using bootstrapping; creates labels_(trn/tst)_(size).pickle and questions_(trn/tst)_(size).pickle files
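The file names above follow a stem_(trn/tst)_(size).pickle pattern. The sketch below shows one way to build such names and to read or write the pickles. The helper names data_file, save and load are hypothetical, and the structure of the pickled objects is not described here, so the example treats them as opaque Python objects.

```python
import pickle
from pathlib import Path


def data_file(stem, mode, size=None, directory="."):
    """Build a name such as labels_trn_641.pickle from its parts."""
    if mode not in ("trn", "tst"):
        raise ValueError("mode must be 'trn' or 'tst'")
    parts = [stem, mode]
    if size is not None:  # some files, e.g. pos_tagged_(trn/tst), carry no size
        parts.append(str(size))
    return Path(directory) / ("_".join(parts) + ".pickle")


def save(obj, path):
    """Pickle obj to path."""
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)


def load(path):
    """Unpickle and return the object stored at path."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

For example, data_file("labels", "trn", 641) yields labels_trn_641.pickle in the current directory.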

Phrase_detection

feature_constructor

  • p – tags all questions in dataset with part-of-speech tags; creates pos_tagged_(trn/tst).pickle file
  • n – tags all questions in dataset with NER tags; creates ner_tagged_(trn/tst).pickle file
  • i – creates features for phrase detection for all questions in the dataset. Requires pos_tagged_*.pickle, ner_tagged_*.pickle and questions_(trn/tst)_(size).pickle files; creates phrase_detect_features_(trn/tst)_(size)_arr.pickle file

feature_constructor

  • l – creates training examples for phrase detection model training. Requires labels_(trn/tst)_(size).pickle and phrase_detect_features_(trn/tst)_(size).pickle files; creates phr_detect_examples_(trn/tst)_(size).pickle and empty_weights_(trn/tst)_(size).pickle files.
  • t – trains phrase detection model. Requires phr_detect_examples_(trn/tst)_(size).pickle and empty_weights_(trn/tst)_(size).pickle files; creates w_(size)_(iterations).pickle file
  • e – computes error of a model on a testing set. Requires w_641_(iterations).pickle, labels_tst_276.pickle, questions_tst_276.pickle and phrase_detect_features_tst_276_arr.pickle files.
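The weight files and the iteration count suggest a perceptron-style linear model, although this README does not name the learning algorithm. Under that assumption, the train-then-evaluate flow of the t and e modes might look roughly like the sketch below. The feature strings, the label set and the function names are all hypothetical.

```python
from collections import defaultdict


def predict(weights, features, labels):
    """Return the label with the highest summed feature weight."""
    def score(label):
        return sum(weights[(feature, label)] for feature in features)
    return max(labels, key=score)


def train(examples, labels, iterations):
    """Plain multiclass perceptron: on a mistake, reward gold and penalize the guess."""
    weights = defaultdict(float)  # starts out as all-zero ("empty") weights
    for _ in range(iterations):
        for features, gold in examples:
            guess = predict(weights, features, labels)
            if guess != gold:
                for feature in features:
                    weights[(feature, gold)] += 1.0
                    weights[(feature, guess)] -= 1.0
    return weights


def error_rate(weights, examples, labels):
    """Fraction of examples whose predicted label differs from the gold label."""
    wrong = sum(predict(weights, feats, labels) != gold for feats, gold in examples)
    return wrong / len(examples)


# Toy data with made-up features and labels, one example per word.
LABELS = ["variable", "relation", "entity"]
EXAMPLES = [
    (["word=who", "pos=WP"], "variable"),
    (["word=directed", "pos=VBD"], "relation"),
    (["word=Alien", "pos=NNP"], "entity"),
]
weights = train(EXAMPLES, LABELS, iterations=5)
print(error_rate(weights, EXAMPLES, LABELS))
```

In practice the error would be measured on the testing split rather than on the training examples, as the e mode does.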

Shift_reduce

shift_reduce

Requires labels_(trn/tst)_(size).pickle, questions_(trn/tst)_(size).pickle and pos_tagged_(trn/tst).pickle files

  • c – creates training examples for shift-reduce model training. Requires gold_dags_(trn/tst)_(size).pickle file; creates dag_examples_(trn/tst)_(size).pickle, gold_sequences_(trn/tst)_(size).pickle and empty_weights_dag_(trn/tst)_(size).pickle files.
  • t – trains shift-reduce model. Requires dag_examples_(trn/tst)_(size).pickle and empty_weights_dag_(trn/tst)_(size).pickle files; creates w_dag(size)_(iterations).pickle file
  • b – computes error of a model on a testing set. Requires w_dag641_(iterations).pickle, gold_dags_tst_276.pickle and gold_sequences_tst_276.pickle files.
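To make the relation between an action sequence and a DAG concrete, the sketch below replays a sequence of shift-reduce actions over detected phrases and collects the resulting edges. The four-action set (shift, reduce, arc_left, arc_right) and its exact semantics are simplifying assumptions for illustration, not necessarily the action set this implementation uses.

```python
def apply_actions(phrases, actions):
    """Replay an action sequence and return DAG edges as (head, dependent) pairs."""
    stack = []
    buffer = list(range(len(phrases)))  # indices of phrases not yet shifted
    edges = []
    for action in actions:
        if action == "shift":
            stack.append(buffer.pop(0))
        elif action == "reduce":
            stack.pop()
        elif action == "arc_right":
            # the stack top becomes the head of the first buffer item
            edges.append((stack[-1], buffer[0]))
        elif action == "arc_left":
            # the first buffer item becomes the head of the stack top
            edges.append((buffer[0], stack[-1]))
        else:
            raise ValueError(f"unknown action: {action}")
    return [(phrases[head], phrases[dep]) for head, dep in edges]


phrases = ["who", "directed", "Alien"]
actions = ["shift", "arc_left", "shift", "arc_right", "shift"]
print(apply_actions(phrases, actions))
# → [('directed', 'who'), ('directed', 'Alien')]
```

Because the arc actions do not pop anything, a phrase can receive more than one head, which is what makes the result a DAG rather than a tree.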

Query_intention

entity_linking

Requires questions_(trn/tst)_(size).pickle

  • e – obtains candidates for entity linking through the Google Freebase API; creates candidates_(trn/tst)_(size).pickle file
  • g – obtains the correct entities for linking. Requires candidates_(trn/tst)_(size).pickle and query_gold_ent_(trn/tst).pickle files; creates gold_entities_(trn/tst)_(size).pickle file
  • f – constructs features for entity linking. Requires gold_entities_(trn/tst)_(size).pickle and candidates_(trn/tst)_(size).pickle files; creates candidates_features_(trn/tst)_(size).pickle and ent_labels_(trn/tst)_(size).pickle files
  • t – trains the model for entity linking. Requires candidates_features_(trn+tst)_(size).pickle and ent_labels_(trn+tst)_(size).pickle files (4 total); creates ent_lr_trn_641.pickle file
  • r – constructs features for relation linking and trains the model. Requires query_gold_rel_trn.pickle file; creates relation_lr_trn_641.pickle file
  • u – evaluates the model for relation linking. Requires relation_lr_trn_641.pickle, query_gold_rel_tst.pickle and rel_dict.pickle files
  • l – constructs features for edge linking. Requires query_gold_edges_(trn/tst).pickle and gold_dags_(trn+tst)_(size).pickle files; creates edge_features_(trn/tst).pickle and edge_labels_(trn/tst).pickle files
  • d – trains the model for edge linking. Requires edge_features_(trn+tst).pickle and edge_labels_(trn+tst).pickle files (4 total); creates edge_lr_trn.pickle file
  • a – links all questions to the KB. Requires all linking models and candidates_(trn/tst)_(size).pickle file.
  • q – parses logic formulas into linked DAGs; creates query_gold_rel_(trn/tst).pickle, query_gold_ent_(trn/tst).pickle, query_gold_dags_(trn/tst).pickle and query_gold_edges_(trn/tst).pickle files
  • c – creates vocabularies for edges and relations
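The modes above are not listed in the order they must run; a workable order follows from which files each mode requires and creates. The sketch below derives one with a topological sort. The MODES table is a hand transcription of part of the list above with the (trn/tst) and (size) suffixes dropped, and the function name run_order is hypothetical. Files that no listed mode creates (for example gold_dags) are treated as external inputs. It needs Python 3.9 or later for graphlib.

```python
from graphlib import TopologicalSorter

# mode letter -> (required file stems, created file stems)
MODES = {
    "q": ([], ["query_gold_rel", "query_gold_ent", "query_gold_dags", "query_gold_edges"]),
    "e": ([], ["candidates"]),
    "g": (["candidates", "query_gold_ent"], ["gold_entities"]),
    "f": (["gold_entities", "candidates"], ["candidates_features", "ent_labels"]),
    "t": (["candidates_features", "ent_labels"], ["ent_lr"]),
    "r": (["query_gold_rel"], ["relation_lr"]),
    "l": (["query_gold_edges", "gold_dags"], ["edge_features", "edge_labels"]),
    "d": (["edge_features", "edge_labels"], ["edge_lr"]),
}


def run_order(modes):
    """Return the modes in an order where every producer runs before its consumers."""
    producer = {stem: mode
                for mode, (_, created) in modes.items()
                for stem in created}
    # Map each mode to the set of modes that must run before it.
    graph = {mode: {producer[stem] for stem in required if stem in producer}
             for mode, (required, _) in modes.items()}
    return list(TopologicalSorter(graph).static_order())


order = run_order(MODES)
print(order)
```

Several valid orders exist, so the exact output may vary; what is guaranteed is that, for instance, q and e come before g, and g comes before f.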

Question_conversion

convert_question

Requires all models, dictionaries, free917_(trn/tst)_answers.txt and pos_tagged_*.pickle files

  • i – answers questions input by user
  • f – answers questions from file
  • a – answers questions from dataset and evaluates them on gold standard answers
