hslcy / glossbert

GlossBERT: BERT for Word Sense Disambiguation with Gloss Knowledge (EMNLP 2019)

Home Page: https://arxiv.org/pdf/1908.07245.pdf

License: MIT License

Java 2.46% Python 97.45% Shell 0.09%

glossbert's Issues

Demo Bug

F.softmax is being called on the SequenceClassifierOutput object, not the tensor
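A minimal sketch of the likely fix, assuming the demo runs a Hugging Face sequence-classification model whose return value exposes the scores as `.logits`. The `output` object below is a hypothetical stand-in built with `SimpleNamespace`, not the demo's actual model output, and the logit values are made up.

```python
from types import SimpleNamespace

import torch
import torch.nn.functional as F

# Hypothetical stand-in for the object returned by the model call;
# only the .logits attribute matters for this illustration.
output = SimpleNamespace(logits=torch.tensor([[1.0, 3.0], [2.0, 0.5]]))

# Passing `output` itself to F.softmax fails because it is not a tensor.
# Pass the logits tensor instead.
probs = F.softmax(output.logits, dim=-1)
print(probs)
```

Each row of `probs` then sums to 1 across the two classes.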

Class imbalance

Hi,

Thanks for the code. How are you dealing with the class imbalance problem? There will be vastly more context-gloss sentence pairs with label 0 than with label 1.
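To make the imbalance concrete, here is a rough sketch that counts labels and derives inverse-frequency class weights. This is one common remedy and not something the repository is known to do. The sense counts are made-up values, and the sketch assumes each target word has exactly one correct sense.

```python
from collections import Counter

def label_counts(senses_per_target):
    """Count context-gloss pairs by label, assuming one correct sense per
    target word; every other candidate gloss becomes a label-0 pair."""
    counts = Counter()
    for num_senses in senses_per_target:
        counts[1] += 1
        counts[0] += num_senses - 1
    return counts

def inverse_frequency_weights(counts):
    """Weight each label by total / (num_labels * label_count)."""
    total = sum(counts.values())
    return {label: total / (len(counts) * n) for label, n in counts.items()}

# Hypothetical number of candidate senses for four target words.
counts = label_counts([2, 5, 8, 3])
weights = inverse_frequency_weights(counts)
print(counts, weights)
```

Weights like these could be passed to a weighted cross-entropy loss, so that the rarer label 1 contributes more per example.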

Last Checkpoint

Can you please write steps on how to use the last checkpoint?

Label Names

Hi! I was wondering in what order yes/no appear in the model output. Is the first logit for yes or for no?

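Compared with the first two methods (GlossBERT(Token-CLS) and GlossBERT(Sent-CLS)), GlossBERT(Sent-CLS-WS) reads weakly supervised data. How does it handle feature extraction, and how does it differ from the other two methods? Any guidance would be appreciated.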

Pretrained Models

Hello,

are there plans to release pretrained models? The model is very expensive to train, and the research community would really benefit from having them available.

Thanks in advance.

Where in WordNet do index.sense and index.sense.gloss come from?

Hello, I looked at the WordNet website but could not find these two files. Where did they come from? Could you also explain what each column means?

How should the weak supervision signal in the paper be understood?

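Thanks for sharing. Is there any theoretical basis or similar prior work for the weak supervision signals described in the paper (quotation marks, etc.)? Thanks.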

How much memory does the program need?

Hello,
Could you tell me how much memory is needed to train GlossBERT with your example settings?
Whenever I train the model, it fails with RuntimeError: CUDA out of memory.
My GPU is a 1080Ti 10G.

error:
Traceback (most recent call last):
  File "run_classifier_WSD_sent.py", line 706, in <module>
    main()
  File "run_classifier_WSD_sent.py", line 520, in main
    logits = model(input_ids=input_ids, token_type_ids=segment_ids, attention_mask=input_mask, labels=None)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 143, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 153, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
    raise output
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
    output = module(*input, **kwargs)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/GlossBERT/modeling.py", line 972, in forward
    _, pooled_output = self.bert(input_ids, token_type_ids, attention_mask, output_all_encoded_layers=False)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/GlossBERT/modeling.py", line 716, in forward
    output_all_encoded_layers=output_all_encoded_layers)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/GlossBERT/modeling.py", line 395, in forward
    hidden_states = layer_module(hidden_states, attention_mask)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/GlossBERT/modeling.py", line 380, in forward
    attention_output = self.attention(hidden_states, attention_mask)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/GlossBERT/modeling.py", line 338, in forward
    self_output = self.self(input_tensor, attention_mask)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/GlossBERT/modeling.py", line 298, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: CUDA out of memory. Tried to allocate 384.00 MiB (GPU 0; 10.92 GiB total capacity; 10.20 GiB already allocated; 23.38 MiB free; 30.57 MiB cached)
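
The allocation that fails is the attention score tensor produced by the `torch.matmul` call in the last frame, which has shape (batch, heads, seq_len, seq_len). A back-of-the-envelope estimate of its size is sketched below. The 12 attention heads (BERT-base) and 4-byte float32 elements are assumptions, and the batch sizes and sequence lengths are illustrative only, since the issue does not state the settings used.

```python
def attention_scores_mib(batch_size, num_heads, seq_len, bytes_per_elem=4):
    """Size in MiB of one (batch, heads, seq_len, seq_len) score tensor."""
    return batch_size * num_heads * seq_len * seq_len * bytes_per_elem / 2**20

# Assumed: 12 heads, float32. Batch size is per GPU.
for batch_size, seq_len in [(32, 512), (128, 256), (8, 512)]:
    size = attention_scores_mib(batch_size, 12, seq_len)
    print(f"batch={batch_size} seq_len={seq_len}: {size} MiB")
```

Under these assumptions, both a batch of 32 at length 512 and a batch of 128 at length 256 give 384 MiB, the size requested in the error. The tensor grows linearly with batch size and quadratically with sequence length, so lowering either per-GPU value is the usual first thing to try.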

Issue with the overall performance in Table 3

We checked the per-POS performance listed in Table 3, but the numbers do not add up. If we sum the correct instances for each POS, as shown in the following table, the overall performance is 77.6, not 77.0. Please check. Thanks.

                             N         V        A       R        ALL
All instances                4300      1652     955     346      7253
SE07                         159       296                       455
Excluding SE07               4141      1356     955     346      6798
Table 3 score (%)            79.8      67.1     79.6    87.4     77
Implied correct instances    3304.518  909.876  760.18  302.404  5276.978
Implied overall score                                            0.776254
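
The arithmetic in the table above can be reproduced with a few lines of Python. The sketch assumes, as the table does, that each per-POS score can be read as the fraction of correct instances. The dictionary names are only for illustration.

```python
# Instance counts after subtracting SE07, and the per-POS scores from the table.
instances = {"N": 4141, "V": 1356, "A": 955, "R": 346}
score_pct = {"N": 79.8, "V": 67.1, "A": 79.6, "R": 87.4}

# Implied number of correct instances per POS.
correct = {pos: instances[pos] * score_pct[pos] / 100 for pos in instances}

# Instance-weighted overall score in percent.
overall_pct = 100 * sum(correct.values()) / sum(instances.values())
print(f"{overall_pct:.1f}")
```

This prints 77.6, the instance-weighted average of the four per-POS scores.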

StopIteration exception?

Hi, have you run into the same problem?
I'm training on eight 16 GB V100 GPUs but get this exception (even when using only one):
