utahnlp / structured_tuning_srl
Implementation of our ACL 2020 paper: Structured Tuning for Semantic Role Labeling
License: Apache License 2.0
When preprocessing the conll05 dataset, the instruction "./make_conll2005_data.sh ../data/treebank_3/" requires the treebank_3 dataset, which is not available. How can I download this dataset?
Thank you for publishing this work.
I am attempting to follow the instructions listed in the README.md.
The first instruction is "First make sure propbank frames are downloaded"; however, I see no directions for downloading the propbank frames in either the README or the paper.
Can you specify where the propbank data should be downloaded from?
Hi, thank you for your great paper.
I'm experimenting with your code and comparing your preprocessing against that of a SOTA paper, using the same dataset. What surprised me is that your preprocessed result differs from theirs. For example, take the following conll12 sentence:
bc/cctv/00/cctv_0002 18 0 So RB (TOP(S(ADVP*) - - - Liu_jiangyong * * (ARGM-DIS*) * * * -
bc/cctv/00/cctv_0002 18 1 , , * - - - Liu_jiangyong * * * * * * -
bc/cctv/00/cctv_0002 18 2 this DT (NP*) - - - Liu_jiangyong * * (ARG0*) * * * (49)
bc/cctv/00/cctv_0002 18 3 will MD (VP* - - - Liu_jiangyong * * (ARGM-MOD*) * * * -
bc/cctv/00/cctv_0002 18 4 have VB (VP* have 01 - Liu_jiangyong * (V*) * * * * -
bc/cctv/00/cctv_0002 18 5 surpassed VBN (VP* surpass 01 1 Liu_jiangyong * * (V*) * * * -
bc/cctv/00/cctv_0002 18 6 what WP (SBAR(WHNP(WHNP*)) - - - Liu_jiangyong * * (ARG1* (ARG2*) * * -
bc/cctv/00/cctv_0002 18 7 it PRP (S(NP*) - - - Liu_jiangyong * * * (ARG1*) * * -
bc/cctv/00/cctv_0002 18 8 is VBZ (VP* be 01 1 Liu_jiangyong * * * (V*) * * -
bc/cctv/00/cctv_0002 18 9 now RB (ADVP*) - - - Liu_jiangyong * * * (ARGM-TMP*) * * -
bc/cctv/00/cctv_0002 18 10 for IN (PP* - - - Liu_jiangyong * * * (ARGM-ADV* * * -
bc/cctv/00/cctv_0002 18 11 Japan NNP (NP* - - - Liu_jiangyong (GPE) * * * (ARG1* (ARG0* (53|(33)
bc/cctv/00/cctv_0002 18 12 and CC * - - - Liu_jiangyong * * * * * * -
bc/cctv/00/cctv_0002 18 13 China NNP *)) - - - Liu_jiangyong (GPE) * * *) *) *) (44)|53)
bc/cctv/00/cctv_0002 18 14 which WDT (SBAR(WHNP*) - - - Liu_jiangyong * * (C-ARG1* * (R-ARG1*) (R-ARG0*) -
bc/cctv/00/cctv_0002 18 15 still RB (S(VP(ADVP*) - - - Liu_jiangyong * * * * (ARGM-TMP*) * -
bc/cctv/00/cctv_0002 18 16 failed VBD * fail 01 1 Liu_jiangyong * * * * (V*) * -
bc/cctv/00/cctv_0002 18 17 to TO (S(VP* - - - Liu_jiangyong * * * * (ARG2* * -
bc/cctv/00/cctv_0002 18 18 reach VB (VP* reach 01 1 Liu_jiangyong * * * * * (V*) -
bc/cctv/00/cctv_0002 18 19 200 CD (NP(QP* - - - Liu_jiangyong (MONEY* * * * * (ARG1* -
bc/cctv/00/cctv_0002 18 20 billion CD *) - - - Liu_jiangyong * * * * * * -
bc/cctv/00/cctv_0002 18 21 US NNP * - - - Liu_jiangyong * * * * * * -
bc/cctv/00/cctv_0002 18 22 dollars NNS *) - - - Liu_jiangyong *) * * * * *) -
bc/cctv/00/cctv_0002 18 23 at IN (PP* - - - Liu_jiangyong * * * * * (ARGM-TMP* -
bc/cctv/00/cctv_0002 18 24 this DT (NP(NP* - - - Liu_jiangyong * * * * * * -
bc/cctv/00/cctv_0002 18 25 33rd JJ * - - - Liu_jiangyong (ORDINAL) * * * * * -
bc/cctv/00/cctv_0002 18 26 anniversary NN *) - - - Liu_jiangyong * * * * * * -
bc/cctv/00/cctv_0002 18 27 of IN (PP* - - - Liu_jiangyong * * * * * * -
bc/cctv/00/cctv_0002 18 28 the DT (NP(NP* - - - Liu_jiangyong * * * * * * -
bc/cctv/00/cctv_0002 18 29 normalization NN *) - - - Liu_jiangyong * * * * * * -
bc/cctv/00/cctv_0002 18 30 of IN (PP* - - - Liu_jiangyong * * * * * * -
bc/cctv/00/cctv_0002 18 31 their PRP$ (NP* - - - Liu_jiangyong * * * * * * (53)
bc/cctv/00/cctv_0002 18 32 diplomatic JJ * - - - Liu_jiangyong * * * * * * -
bc/cctv/00/cctv_0002 18 33 relations NNS *)))))))))))))))))) - - - Liu_jiangyong * * *)) * *) *) -
bc/cctv/00/cctv_0002 18 34 . . *)) - - - Liu_jiangyong * * * * * * -
Your preprocessed result consists of 4 predicate-arg pairs (conll2012.train.txt, line 28617 to 28620):
4 So , this will have surpassed what it is now for Japan and China which still failed to reach 200 billion US dollars at this 33rd anniversary of the normalization of their diplomatic relations . ||| O O O O B-V O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O ||| 01
8 So , this will have surpassed what it is now for Japan and China which still failed to reach 200 billion US dollars at this 33rd anniversary of the normalization of their diplomatic relations . ||| O O O O O O B-ARG2 B-ARG1 B-V B-ARGM-TMP B-ARGM-ADV I-ARGM-ADV I-ARGM-ADV I-ARGM-ADV O O O O O O O O O O O O O O O O O O O O O ||| 01
16 So , this will have surpassed what it is now for Japan and China which still failed to reach 200 billion US dollars at this 33rd anniversary of the normalization of their diplomatic relations . ||| O O O O O O O O O O O B-ARG1 I-ARG1 I-ARG1 B-R-ARG1 B-ARGM-TMP B-V B-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 O ||| 01
18 So , this will have surpassed what it is now for Japan and China which still failed to reach 200 billion US dollars at this 33rd anniversary of the normalization of their diplomatic relations . ||| O O O O O O O O O O O B-ARG0 I-ARG0 I-ARG0 B-R-ARG0 O O O B-V B-ARG1 I-ARG1 I-ARG1 I-ARG1 B-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP O ||| 01
But theirs produces 5:
O O O O B-V O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O
B-ARGM-DIS O B-ARG0 B-ARGM-MOD O B-V B-ARG1 I-ARG1 I-ARG1 I-ARG1 I-ARG1 I-ARG1 I-ARG1 I-ARG1 B-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 I-C-ARG1 O
O O O O O O B-ARG2 B-ARG1 B-V B-ARGM-TMP B-ARGM-ADV I-ARGM-ADV I-ARGM-ADV I-ARGM-ADV O O O O O O O O O O O O O O O O O O O O O
O O O O O O O O O O O B-ARG1 I-ARG1 I-ARG1 B-R-ARG1 B-ARGM-TMP B-V B-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 I-ARG2 O
O O O O O O O O O O O B-ARG0 I-ARG0 I-ARG0 B-R-ARG0 O O O B-V B-ARG1 I-ARG1 I-ARG1 I-ARG1 B-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP I-ARGM-TMP O
The conll sentence also has 5 N:ARGS columns. It seems that the second predicate ("surpassed", the second of their five tag sequences above, which starts with B-ARGM-DIS) is missing from your preprocessed result.
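For reference, here is a minimal sketch of how one bracketed argument column from the conll file can be turned into BIO tags. It is not the repository's preprocessing code, nor the other paper's; the function name spans_to_bio and the variable surpassed_column are hypothetical, and the column is copied by hand from the sentence above. In the "surpassed" column, (C-ARG1* opens at "which" before (ARG1* has been closed, and both are closed by *)) on "relations". The sketch simply starts a new B- tag at every opening bracket; a stricter converter might treat such a column differently, though that is only a guess.

```python
import re


def spans_to_bio(column):
    """Convert one bracketed argument column (one cell per token) to BIO tags.

    Assumption: a new opening bracket always starts a new span, even if the
    previous span has not been closed yet.
    """
    tags = []
    current = None  # label of the span that is currently open, if any
    for cell in column:
        opening = re.match(r"\(([^*()]+)", cell)
        if opening:
            # e.g. "(ARG1*" or "(V*)" starts a new span with that label
            current = opening.group(1)
            tags.append("B-" + current)
        elif current is not None:
            # a bare "*" (or a closing "*)") inside an open span
            tags.append("I-" + current)
        else:
            tags.append("O")
        if cell.endswith(")"):
            # any closing bracket ends the span that is currently open
            current = None
    return tags


# The argument column for "surpassed", copied from the sentence above.
surpassed_column = (
    ["(ARGM-DIS*)", "*", "(ARG0*)", "(ARGM-MOD*)", "*", "(V*)", "(ARG1*"]
    + ["*"] * 7
    + ["(C-ARG1*"]
    + ["*"] * 18
    + ["*))", "*"]
)

print(" ".join(spans_to_bio(surpassed_column)))
```

Running the same function on each of the five argument columns gives one tag sequence per predicate, which makes it easy to check how many sequences a sentence should yield.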
I'm new to this task and don't know much about it. Could you clarify this issue?
Thank you.
Thank you for publishing this work.
I have my own dataset (in CoNLL-2012 format, based on a modified version of PropBank Frames) and I would like to use your project to run SRL on it. In particular, I would like to fine-tune the "tli8hf/robertabase-crf-conll2012" or the "tli8hf/robertabase-structured-tuning-srl-conll2012" model on my dataset. I'm new to this task.
How can I do it?
Thanks.