ampersand-emnlp2019's Issues

Getting started with AMPERSAND

Hi,

I'm a computer science undergraduate doing my final year project on argument mining. I came across your AMPERSAND paper and would like to use it in my project to explore argument mining further. Thank you for kindly making it available 👍

At the moment I would like to test its performance on other subreddits. I have read the supplementary PDF, the README, and parts of the BERT documentation. However, I still have some questions about how to get started running AMPERSAND:

  1. Are the models linked in the README already finetuned? Do we need to train them ourselves before we load and use them?

  2. Could you give a concrete example of the input data's format, i.e. the subreddit data? In a previous issue, it was said that
    "File format is tsv
    Sentence1\tLabel1
    Sentence1\tLabel2
    Sentence1\tLabel3"
    but what are the sentences and labels? (I've sketched my current guess below, after question 3.)

  3. How do I run your full pipeline? Does the following command run the input data through the entire pipeline (argument component classification, relation identification, RST, etc.)?

export GLUE_DIR=/path/to/data

python run_classifier.py \
  --task_name ARG \
  --do_eval \
  --data_dir $GLUE_DIR/ \
  --bert_model bert-base-uncased \
  --max_seq_length 128 \
  --train_batch_size 32 \
  --learning_rate 2e-5 \
  --output_dir /tmp/output/
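
To make question 2 concrete, here is the kind of TSV I am currently imagining. The label names come from the paper, but everything else (the dev.tsv file name, string labels rather than integer ids, no header row) is just my guess:

# Sketch of the input file I am guessing run_classifier.py expects in --data_dir.
# The label set {claim, premise, non-arg} is taken from the paper; the file name
# dev.tsv and the use of string labels are my assumptions, not from the README.
rows = [
    ("Dogs are better than cats.", "claim"),
    ("Dogs are more affectionate and friendly.", "premise"),
    ("What did you have for lunch?", "non-arg"),
]

with open("dev.tsv", "w", encoding="utf-8") as f:
    for sentence, label in rows:
        f.write(f"{sentence}\t{label}\n")

Is that roughly right, and is dev.tsv the file name that --do_eval looks for inside --data_dir?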

Thank you very much.

How does the Argument Component Classifier work?

Hi,
I came across your EMNLP paper. I was hoping to test your pretrained Argument Component BERT classifier on some other subreddits. Could you explain in a bit more detail how the component classification works (i.e. how the results in Table 1 of the paper were produced)?
Specifically,

  1. What is the format of the output from the BERT Argument Component Classifier? Does it predict a label for each token, as in a BIO tagging scheme?
  2. What files/commands do I need to run for a new text file (assume abc.txt is a text file containing one Reddit comment per line)? My rough plan is sketched below.
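
For context, here is what I was planning to try for question 2. Everything in it is my assumption rather than something I found in the README: the dummy non-arg label, one sentence per row, the dev.tsv file name, and the naive sentence splitting.

# Convert abc.txt (one Reddit comment per line) into the sentence<TAB>label TSV
# that I *think* run_classifier.py reads; the "non-arg" placeholder label is only
# there so the file parses and would be ignored at prediction time (my assumption).
import re

with open("abc.txt", encoding="utf-8") as f:
    comments = [line.strip() for line in f if line.strip()]

with open("dev.tsv", "w", encoding="utf-8") as f:
    for comment in comments:
        # Naive sentence split on ., ! and ? -- I assume each sentence is classified independently.
        for sentence in re.split(r"(?<=[.!?])\s+", comment):
            if sentence:
                f.write(f"{sentence}\tnon-arg\n")

# ...followed by something like the eval command from the README:
#   python run_classifier.py --task_name ARG --do_eval --data_dir . \
#     --bert_model bert-base-uncased --max_seq_length 128 --output_dir /tmp/output/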

Eagerly waiting for your response.

AMPERSAND inputs and labels for Argument Component Classification and Inter-turn Relations Prediction

Hi,

Thank you very much for answering my questions in my last issue. Your answers were very helpful.

I have some further clarifying questions about the inputs to the Argument Component Classification and Inter-turn Relations Prediction models (I have sketched my current understanding of the formats after the questions below):

About the Argument Component Classification:

  1. Just to clarify, for each sample, is the input a single text sequence? (Not a pair of sequences?)

1a. (Side question) How does the model learn to classify components correctly if the context surrounding that component is not also provided to it?
E.g. Argument 1: "Dogs are better than cats. (claim) Dogs are more affectionate and friendly. (premise)" How does the model correctly classify "Dogs are more affectionate and friendly" as a premise if it doesn't see its context, here the rest of Argument 1?

  2. Does AMPERSAND assume that the argument-component-boundary-detection task has already been completed, and therefore that the inputs to the component classification model are argumentative sub-units that have already been determined?
    E.g. Given Argument 1, is it assumed that we have already determined "Dogs are better than cats." and "Dogs are more affectionate and friendly" as argumentative sub-units? Is it these sub-units that are then input to and classified by the model?

  3. If so, and I wanted to test the model on a new, unseen dataset, would I need an additional system to identify these argumentative sub-units first, before inputting them into the component classification model, i.e. to carry out the component-boundary detection task?

  4. How are the labels encoded for the component classification? E.g. 0 -> claim, 1 -> premise, 2 -> non-arg?

About Inter-turn Relations Prediction:

  5. The input for each sample is Sentence1 \t Sentence2. For training and testing, are these sentences complete arguments like Argument 1? Or are they just the claim components of two different arguments (e.g. just "Dogs are better than cats")?

  6. How are the labels encoded for this relations prediction? From Figure 2 in your paper, the model outputs 1 if there is a relation and 0 if there is not. If a relation is present, how could we determine its type, i.e. whether it is support or attack?

General questions:
  7. Just to clarify, is the same code in run_classifier.py used for both the argument component classification and relations prediction tasks? (But for each task we would load a different fine-tuned model?)

  8. Are run_lm_finetuning.py, run_squad.py, or run_swag.py ever used in the AMPERSAND pipeline?
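
To make my assumptions explicit, this is the picture I currently have of the two tasks' inputs and labels (questions 1-6 above). None of it is confirmed by the README; the example sentences, the 0/1 relation labels, and the idea that components are pre-segmented are all my guesses:

# (a) Argument component classification: one already-segmented sub-unit per row,
#     with a component label (my guess at the label set, taken from the paper).
component_rows = [
    ("Dogs are better than cats.", "claim"),
    ("Dogs are more affectionate and friendly.", "premise"),
]

# (b) Inter-turn relation prediction: a pair of sentences per row with a binary
#     label, 1 = related and 0 = unrelated, as I read Figure 2 of the paper.
relation_rows = [
    ("Dogs are more affectionate and friendly.", "Dogs are better than cats.", 1),
    ("What did you have for lunch?", "Dogs are better than cats.", 0),
]

Please correct me wherever this is wrong.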

Thank you for taking the time to read this issue - I know there are a lot of questions!

Broken link

Hi there, I enjoyed reading your paper! Unfortunately, the link in the README to the "Fine-tuned Pytorch Model using IMHO+Context as Intermediate Pretraining over BERT" is now broken; could you please update it?
