lxucs / coref-hoi
PyTorch implementation of the end-to-end coreference resolution model with different higher-order inference methods.
License: Apache License 2.0
I have been able to run the models on the OntoNotes test set, but how do we get predictions for our own CoNLL-U files?
Hi,
I trained the SpanBERT large model with the default parameters in the config file and got Avg. F1 78.27, lower than the Avg. F1 of 79.9 reported in the paper.
The config is as follows:
num_docs = 2802
bert_learning_rate = 1e-05
task_learning_rate = 0.0003
max_segment_len = 512
ffnn_size = 3000
cluster_ffnn_size = 3000
max_training_sentences = 3
bert_tokenizer_name = bert-base-cased
max_top_antecedents = 50
max_training_sentences = 5
top_span_ratio = 0.4
max_num_extracted_spans = 3900
max_num_speakers = 20
max_segment_len = 256
bert_learning_rate = 1e-5
task_learning_rate = 2e-4
loss_type = marginalized # {marginalized, hinge}
mention_loss_coef = 0
false_new_delta = 1.5 # For loss_type = hinge
adam_eps = 1e-6
adam_weight_decay = 1e-2
warmup_ratio = 0.1
max_grad_norm = 1 # Set 0 to disable clipping
gradient_accumulation_steps = 1
coref_depth = 1 # when 1: no higher order (except for cluster_merging)
higher_order = attended_antecedent # {attended_antecedent, max_antecedent, entity_equalization, span_clustering, cluster_merging}
coarse_to_fine = true
fine_grained = true
dropout_rate = 0.3
ffnn_size = 1000
ffnn_depth = 1
cluster_ffnn_size = 1000 # For cluster_merging
cluster_reduce = mean # For cluster_merging
easy_cluster_first = false # For cluster_merging
cluster_dloss = false # cluster_merging
num_epochs = 24
feature_emb_size = 20
max_span_width = 30
use_metadata = true
use_features = true
use_segment_distance = true
model_heads = true
use_width_prior = true # For mention score
use_distance_prior = true # For mention-ranking score
conll_eval_path = dev.english.v4_gold_conll # gold_conll file for dev
conll_test_path = test.english.v4_gold_conll # gold_conll file for test
genres = ["bc", "bn", "mz", "nw", "pt", "tc", "wb"]
eval_frequency = 1000
report_frequency = 100
After steps 1 and 2, I tried step 3 of the Basic Setup:
Prepare dataset (requiring OntoNotes 5.0 corpus): ./setup_data.sh /path/to/ontonotes /path/to/data/dir
.
.
reference-coreference-scorers/v8.01/test/DataFiles/TC-N.key
reference-coreference-scorers/v8.01/test/test.pl
reference-coreference-scorers/v8.01/test/TestCases.README
bash: conll-2012/v3/scripts/skeleton2conll.sh: No such file or directory
However, the file coref_hoi/data/dir/conll-2012/v3/scripts/skeleton2conll.sh does exist.
Do I need to change any other file before running setup_data.sh?
Hi,
First, I want to thank you so much for your valuable efforts, and this perfectly comprehensible and clean code.
I do not know whether I should ask this here, but I ran into a CUDA out of memory error in the evaluation phase (something like this: RuntimeError: CUDA out of memory. Tried to allocate 1.02 GiB (GPU 0; 7.93 GiB total capacity; 4.76 GiB already allocated; 948.81 MiB free; 6.23 GiB reserved in total by PyTorch)).
I first hit this error in the training phase. I reduced some parameters in the experiments.conf file to lower GPU memory usage, and that worked: training now completes. However, the error still appears in the evaluation phase no matter how much I decrease parameters such as span width, max_sentence_len, or the FFNN size. Have you seen the same problem, or do you have any suggestions for me?
I am currently using GeForce GTX 1080 with 8GB memory.
Many thanks,
Arad
All the required data and models have been downloaded to the proper paths.
I tried to run predict.py with the command:
python predict.py --config_name=train_spanbert_large_ml0_d2 --model_identifier=May08_12-38-29_58000 --gpu_id=0
and encountered a ValueError:
Traceback (most recent call last):
  File "predict.py", line 71, in <module>
    nlp.add_pipe(nlp.create_pipe('sentencizer'))
  File "/home/qliu/anaconda3/envs/e2e/lib/python3.6/site-packages/spacy/language.py", line 754, in add_pipe
    raise ValueError(err)
ValueError: [E966] `nlp.add_pipe` now takes the string name of the registered component factory, not a callable component. Expected string, but got <spacy.pipeline.sentencizer.Sentencizer object at 0x7f7fabe3f288> (name: 'None').
- If you created your component with `nlp.create_pipe('name')`: remove nlp.create_pipe and call `nlp.add_pipe('name')` instead.
- If you passed in a component like `TextCategorizer()`: call `nlp.add_pipe` with the string name instead, e.g. `nlp.add_pipe('textcat')`.
- If you're using a custom component: Add the decorator `@Language.component` (for function components) or `@Language.factory` (for class components / factories) to your custom component and assign it a name, e.g. `@Language.component('your_name')`. You can then run `nlp.add_pipe('your_name')` to add it to the pipeline.
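The E966 message points at the fix: on spaCy v3, add_pipe takes the registered factory name as a string instead of a component object. Below is a minimal sketch of a version-tolerant replacement for the failing line in predict.py. The helper name add_sentencizer is hypothetical and not part of the repository, and the fallback branch assumes the older spaCy v2 call style shown in the traceback.

```python
def add_sentencizer(nlp):
    """Add spaCy's rule-based sentencizer to `nlp` on spaCy v2 or v3.

    Hypothetical helper for illustration; not part of coref-hoi.
    """
    if "sentencizer" in nlp.pipe_names:
        return nlp  # already in the pipeline, nothing to do
    try:
        # spaCy v3: pass the registered factory name as a string.
        nlp.add_pipe("sentencizer")
    except ValueError:
        # spaCy v2 rejects a string here and expects a component object.
        nlp.add_pipe(nlp.create_pipe("sentencizer"))
    return nlp
```

In predict.py this would stand in for the line at the top of the traceback, for example `nlp = add_sentencizer(nlp)`. Installing a spaCy 2.x release instead should also keep the original line working, since `nlp.add_pipe(nlp.create_pipe(...))` is the v2 call style.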
Hello, I'd like to know what results this model gets when trained on BERT base. I trained on BERT base with c2f (python run.py train_bert_base_ml0_d2) but only got about 67 F1.
The repo does not contain any license specification. It would be great if you could license it explicitly under a FOSS license so that further research can build upon this great code!
Personally I'd suggest the MIT license, but an Apache or a GPL variety could also be a great choice.
Most of these licenses require attribution in source code distributions, so you would have to be credited (as you should be).
Hi all,
I was wondering if it is possible to train this model on custom data that one prepares oneself. If so, how does one do this with coref-hoi? Will it convert a txt file to the right format, or does one have to convert it to a CoNLL file first? Can it be CoNLL-U? Thank you very much.
Hi lxucs,
There are 2 checkpoints of the trained weights. Which one was used in your paper?
Thanks
Below is an example:
train_spanbert_large_ml0_cm_fn1000_max_dloss/model_May14_05-15-38_63000.bin
train_spanbert_large_ml0_cm_fn1000_max_dloss/model_May22_23-31-16_66000.bin
Hi @lxucs,
I want to train a bert_base model with no HOI, like the spanbert_large_ml0_d1 model:
python run.py bert_base 0
I got this error:
Traceback (most recent call last):
  File "run.py", line 289, in <module>
    model = runner.initialize_model()
  File "run.py", line 51, in initialize_model
    model = CorefModel(self.config, self.device)
  File "/VL/space/sushantakp/research_work/coref-hoi/model.py", line 33, in __init__
    self.bert = BertModel.from_pretrained(config['bert_pretrained_name_or_path'])
  File "/VL/space/sushantakp/.conda/envs/skp_env376/lib/python3.7/site-packages/transformers/modeling_utils.py", line 935, in from_pretrained
    raise EnvironmentError(msg)
OSError: Can't load weights for 'bert-base-cased'. Make sure that:
Do I need to change any parameters in experiments.conf, either to handle the above issue or to train with HOI / no HOI?
Good work.
I only see weights for the large model. Could you also provide weights for the base model? That would make debugging much easier.
Thanks.
Hi @lxucs
Could you briefly describe how to use analyze.py?
Hi again Liyan,
I have some brief questions about splitting documents into segments. I think segments contain more than one sentence (based on the split_into_segments function in the preprocess.py file). Would it not be better if each segment contained only one sentence? I could not see the intuition behind it. Is it better to have longer segments, or is this for more efficient use of resources? Or was it tested in practice, with the trained model achieving better accuracy this way?
Thanks,
Arad
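The question above turns on how sentences are packed into segments. As a rough illustration only, and not the repository's split_into_segments, the sketch below greedily packs whole sentences into segments of at most max_segment_len tokens. The function name pack_sentences and the toy token lists are hypothetical, and the sketch ignores the special tokens a real BERT input would reserve room for.

```python
from typing import List


def pack_sentences(sentences: List[List[str]], max_segment_len: int) -> List[List[str]]:
    """Greedily pack whole sentences into segments of at most max_segment_len tokens.

    Hypothetical illustration; not the logic used in preprocess.py.
    """
    segments: List[List[str]] = []
    current: List[str] = []
    for sent in sentences:
        if len(sent) > max_segment_len:
            # A sentence longer than the limit cannot be kept whole:
            # flush what we have, then cut the sentence into fixed-size pieces.
            if current:
                segments.append(current)
                current = []
            for i in range(0, len(sent), max_segment_len):
                segments.append(sent[i:i + max_segment_len])
            continue
        if current and len(current) + len(sent) > max_segment_len:
            # The next sentence would overflow the segment, so start a new one.
            segments.append(current)
            current = []
        current = current + sent
    if current:
        segments.append(current)
    return segments


doc = [["John", "left", "."], ["He", "was", "tired", "."], ["Mary", "stayed", "."]]
print(pack_sentences(doc, max_segment_len=8))  # two segments: 7 tokens, then 3
print(pack_sentences(doc, max_segment_len=4))  # one sentence per segment
```

With the larger limit, "John" and "He" end up in the same segment, so an encoder that processes one segment at a time would see both mentions together. With one sentence per segment it would not, which is presumably part of the motivation for longer segments.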