
structured_tuning_srl's People

Contributors

svivek, t-li


structured_tuning_srl's Issues

Fine-tuning on a personal dataset starting from "tli8hf/robertabase-structured-tuning-srl-conll2012" model or similar

Thank you for publishing this work.
I have a personal dataset (in CoNLL-2012 format, based on a modified version of the PropBank frames) and I would like to use your project to perform SRL on it. In particular, I would like to fine-tune the "tli8hf/robertabase-crf-conll2012" or the "tli8hf/robertabase-structured-tuning-srl-conll2012" model on my dataset. I'm new to this task.
How can I do it?
Thanks.

Downloading propbank frames

Thank you for publishing this work.

I am attempting to follow the instructions as listed in the README.md.

The first instruction is "First make sure propbank frames are downloaded"; however, I see no directions for downloading the PropBank frames in the README or in the paper.

Can you specify where the PropBank data should be downloaded from?

How to download treebank_3

When preprocessing the CoNLL-2005 datasets, the instruction "./make_conll2005_data.sh ../data/treebank_3/" requires the treebank_3 dataset, which is not available. How can I download this dataset?

Preprocessing results don't match SOTA paper

Hi, thank you for your great paper.

I'm experimenting with your code and comparing your preprocessing against that of a SOTA paper, using the same dataset. What surprised me is that your preprocessed result differs from theirs. For example, for the following CoNLL-2012 sentence:

bc/cctv/00/cctv_0002   18    0               So     RB         (TOP(S(ADVP*)        -    -   -    Liu_jiangyong          *      *    (ARGM-DIS*)            *             *            *          -
bc/cctv/00/cctv_0002   18    1                ,      ,                    *         -    -   -    Liu_jiangyong          *      *             *             *             *            *          -
bc/cctv/00/cctv_0002   18    2             this     DT                 (NP*)        -    -   -    Liu_jiangyong          *      *        (ARG0*)            *             *            *        (49)
bc/cctv/00/cctv_0002   18    3             will     MD                 (VP*         -    -   -    Liu_jiangyong          *      *    (ARGM-MOD*)            *             *            *          -
bc/cctv/00/cctv_0002   18    4             have     VB                 (VP*       have  01   -    Liu_jiangyong          *    (V*)            *             *             *            *          -
bc/cctv/00/cctv_0002   18    5        surpassed    VBN                 (VP*    surpass  01   1    Liu_jiangyong          *      *           (V*)            *             *            *          -
bc/cctv/00/cctv_0002   18    6             what     WP    (SBAR(WHNP(WHNP*))        -    -   -    Liu_jiangyong          *      *        (ARG1*        (ARG2*)            *            *          -
bc/cctv/00/cctv_0002   18    7               it    PRP               (S(NP*)        -    -   -    Liu_jiangyong          *      *             *        (ARG1*)            *            *          -
bc/cctv/00/cctv_0002   18    8               is    VBZ                 (VP*         be  01   1    Liu_jiangyong          *      *             *           (V*)            *            *          -
bc/cctv/00/cctv_0002   18    9              now     RB               (ADVP*)        -    -   -    Liu_jiangyong          *      *             *    (ARGM-TMP*)            *            *          -
bc/cctv/00/cctv_0002   18   10              for     IN                 (PP*         -    -   -    Liu_jiangyong          *      *             *    (ARGM-ADV*             *            *          -
bc/cctv/00/cctv_0002   18   11            Japan    NNP                 (NP*         -    -   -    Liu_jiangyong       (GPE)     *             *             *        (ARG1*       (ARG0*    (53|(33)
bc/cctv/00/cctv_0002   18   12              and     CC                    *         -    -   -    Liu_jiangyong          *      *             *             *             *            *          -
bc/cctv/00/cctv_0002   18   13            China    NNP                   *))        -    -   -    Liu_jiangyong       (GPE)     *             *             *)            *)           *)   (44)|53)
bc/cctv/00/cctv_0002   18   14            which    WDT          (SBAR(WHNP*)        -    -   -    Liu_jiangyong          *      *      (C-ARG1*             *      (R-ARG1*)    (R-ARG0*)         -
bc/cctv/00/cctv_0002   18   15            still     RB          (S(VP(ADVP*)        -    -   -    Liu_jiangyong          *      *             *             *    (ARGM-TMP*)           *          -
bc/cctv/00/cctv_0002   18   16           failed    VBD                    *       fail  01   1    Liu_jiangyong          *      *             *             *           (V*)           *          -
bc/cctv/00/cctv_0002   18   17               to     TO               (S(VP*         -    -   -    Liu_jiangyong          *      *             *             *        (ARG2*            *          -
bc/cctv/00/cctv_0002   18   18            reach     VB                 (VP*      reach  01   1    Liu_jiangyong          *      *             *             *             *          (V*)         -
bc/cctv/00/cctv_0002   18   19             200      CD              (NP(QP*         -    -   -    Liu_jiangyong    (MONEY*      *             *             *             *       (ARG1*          -
bc/cctv/00/cctv_0002   18   20          billion     CD                    *)        -    -   -    Liu_jiangyong          *      *             *             *             *            *          -
bc/cctv/00/cctv_0002   18   21               US    NNP                    *         -    -   -    Liu_jiangyong          *      *             *             *             *            *          -
bc/cctv/00/cctv_0002   18   22          dollars    NNS                    *)        -    -   -    Liu_jiangyong          *)     *             *             *             *            *)         -
bc/cctv/00/cctv_0002   18   23               at     IN                 (PP*         -    -   -    Liu_jiangyong          *      *             *             *             *   (ARGM-TMP*          -
bc/cctv/00/cctv_0002   18   24             this     DT              (NP(NP*         -    -   -    Liu_jiangyong          *      *             *             *             *            *          -
bc/cctv/00/cctv_0002   18   25             33rd     JJ                    *         -    -   -    Liu_jiangyong   (ORDINAL)     *             *             *             *            *          -
bc/cctv/00/cctv_0002   18   26      anniversary     NN                    *)        -    -   -    Liu_jiangyong          *      *             *             *             *            *          -
bc/cctv/00/cctv_0002   18   27               of     IN                 (PP*         -    -   -    Liu_jiangyong          *      *             *             *             *            *          -
bc/cctv/00/cctv_0002   18   28              the     DT              (NP(NP*         -    -   -    Liu_jiangyong          *      *             *             *             *            *          -
bc/cctv/00/cctv_0002   18   29    normalization     NN                    *)        -    -   -    Liu_jiangyong          *      *             *             *             *            *          -
bc/cctv/00/cctv_0002   18   30               of     IN                 (PP*         -    -   -    Liu_jiangyong          *      *             *             *             *            *          -
bc/cctv/00/cctv_0002   18   31            their   PRP$                 (NP*         -    -   -    Liu_jiangyong          *      *             *             *             *            *        (53)
bc/cctv/00/cctv_0002   18   32       diplomatic     JJ                    *         -    -   -    Liu_jiangyong          *      *             *             *             *            *          -
bc/cctv/00/cctv_0002   18   33        relations    NNS   *))))))))))))))))))        -    -   -    Liu_jiangyong          *      *            *))            *             *)           *)         -
bc/cctv/00/cctv_0002   18   34                .      .                   *))        -    -   -    Liu_jiangyong          *      *             *             *             *            *          -

Your preprocessed result contains 4 predicate-argument entries (conll2012.train.txt, lines 28617 to 28620):

4 So , this will have surpassed what it is now for Japan and China which still failed to reach 200 billion US dollars at this 33rd anniversary of the normalization of their diplomatic relations . ||| O O O O B-V O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O ||| 01 
8 So , this will have surpassed what it is now for Japan and China which still failed to reach 200 billion US dollars at this 33rd anniversary of the normalization of their diplomatic relations . ||| O O O O O O B-ARG2 B-ARG1 B-V B-ARGM-TMP B-ARGM-ADV I-ARGM-ADV I-ARGM-ADV I-ARGM-ADV O O O O O O O O O O O O O O O O O O O O O ||| 01 
16 So , this will have surpassed what it is now for Japan and China which still failed to reach 200 billion US dollars at this 33rd anniversary of the normalization of their diplomatic relations . ||| O O O O O O O O O O O B-ARG1 I-ARG1 I-ARG1 B-R-ARG1 B-ARGM-TMP B-V B-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 O ||| 01 
18 So , this will have surpassed what it is now for Japan and China which still failed to reach 200 billion US dollars at this 33rd anniversary of the normalization of their diplomatic relations . ||| O O O O O O O O O O O B-ARG0 I-ARG0 I-ARG0 B-R-ARG0 O O O B-V B-ARG1 I-ARG1 I-ARG1 I-ARG1 B-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP O ||| 01 
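
For anyone checking these lines by hand, here is a minimal sketch of a parser for the format shown above. It assumes each line is `predicate_index tokens ||| BIO labels ||| frameset id`, with a 0-based predicate index; that layout is inferred from the four lines above, not from the repository's documentation. The function name `parse_srl_line` and the shortened example line are hypothetical.

```python
def parse_srl_line(line):
    """Split one preprocessed line into (predicate index, tokens, labels, frameset).

    Assumed layout (inferred from the example lines, not from the repo docs):
        <predicate index> <token> <token> ... ||| <label> <label> ... ||| <frameset>
    """
    head, label_field, frameset = (part.strip() for part in line.split("|||"))
    index_str, *tokens = head.split()
    labels = label_field.split()
    if len(tokens) != len(labels):
        raise ValueError(f"{len(tokens)} tokens but {len(labels)} labels")
    return int(index_str), tokens, labels, frameset


# Hypothetical shortened line in the same layout as the real ones above.
example = "1 this surpassed what ||| B-ARG0 B-V B-ARG1 ||| 01 "
pred_idx, tokens, labels, frameset = parse_srl_line(example)

# The label at the predicate index should be the verb itself.
print(tokens[pred_idx], labels[pred_idx])  # surpassed B-V
```

Counting how many such lines share the same token sequence gives the number of predicates the preprocessing kept for a sentence, which is the quantity being compared here.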

But theirs produces 5:

O O O O B-V O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O
B-ARGM-DIS O B-ARG0 B-ARGM-MOD O B-V B-ARG1 I-ARG1 I-ARG1 I-ARG1 I-ARG1 I-ARG1 I-ARG1 I-ARG1 B-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 O
O O O O O O B-ARG2 B-ARG1 B-V B-ARGM-TMP B-ARGM-ADV I-ARGM-ADV I-ARGM-ADV I-ARGM-ADV O O O O O O O O O O O O O O O O O O O O O
O O O O O O O O O O O B-ARG1 I-ARG1 I-ARG1 B-R-ARG1 B-ARGM-TMP B-V B-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 O
O O O O O O O O O O O B-ARG0 I-ARG0 I-ARG0 B-R-ARG0 O O O B-V B-ARG1 I-ARG1 I-ARG1 I-ARG1 B-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP O

Also, the CoNLL sentence itself has 5 predicate-argument (N:ARGS) columns. It seems that the label sequence for the second predicate, "surpassed" (the one starting with B-ARGM-DIS in their output above), is missing from your preprocessed result.

I'm new to this task and don't know much about it. Could you clarify this issue?
Thank you.
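
To make the comparison reproducible, the following is a rough sketch of how the predicate-argument columns of a CoNLL-2012 sentence can be counted and converted to BIO labels. It is not the preprocessing code of this repository or of the other paper. It assumes the standard CoNLL-2012 layout (11 fixed columns, then one column per predicate, then a final coreference column) and, as one possible convention, labels each token with the innermost open span when spans are nested, as happens with `(ARG1*` and `(C-ARG1*` in the sentence above. The names `srl_columns` and `brackets_to_bio` and the abbreviated sample rows are hypothetical.

```python
import re

# CoNLL-2012 fixed columns: document id, part, word index, word, POS, parse bit,
# predicate lemma, frameset id, word sense, speaker, named entity.
FIXED_COLS = 11


def srl_columns(rows):
    """Return one list of raw bracket tags per predicate column."""
    n_preds = len(rows[0]) - FIXED_COLS - 1  # the last column is coreference
    return [[row[FIXED_COLS + k] for row in rows] for k in range(n_preds)]


def brackets_to_bio(tags):
    """Convert tags such as '(ARG1*', '*', '*))' into BIO labels.

    Assumption: with nested spans, the innermost open span labels the token.
    """
    labels, stack = [], []
    for tag in tags:
        opened = re.findall(r"\(([^*()]+)", tag)
        stack.extend(opened)
        if opened:
            labels.append("B-" + stack[-1])
        elif stack:
            labels.append("I-" + stack[-1])
        else:
            labels.append("O")
        for _ in range(tag.count(")")):
            stack.pop()
    return labels


# Hypothetical, heavily abbreviated rows with two predicate columns.
sample = """\
doc 0 0 this DT * - - - spk * (ARG0*) * -
doc 0 1 surpassed VBN * surpass 01 1 spk * (V*) * -
doc 0 2 what WP * - - - spk * (ARG1* (ARG2*) -
doc 0 3 is VBZ * be 01 1 spk * * (V*) -
doc 0 4 which WDT * - - - spk * (C-ARG1* * -
doc 0 5 failed VBD * - - - spk * *)) * -
"""
rows = [line.split() for line in sample.splitlines()]
bio = [brackets_to_bio(col) for col in srl_columns(rows)]
for labels in bio:
    print(" ".join(labels))
```

Under this convention the nested `C-ARG1` span gets its own `B-`/`I-` labels instead of being swallowed by the enclosing `ARG1`, which matches the 5-row output quoted above. Whether the nesting is the reason the predicate is dropped in this repository's preprocessing is not established here.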
