glossbert's People

Contributors

hslcy

glossbert's Issues

Demo Bug

F.softmax is being called on the SequenceClassifierOutput object, not the tensor
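
A minimal sketch of the bug pattern described above, using only the standard library. `ClassifierOutput` is a hypothetical stand-in for the `SequenceClassifierOutput` wrapper, and `softmax` is a plain-Python substitute for `F.softmax`; neither is taken from the GlossBERT demo itself. The point is only that the softmax must be applied to the `logits` field, not to the wrapper object.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class ClassifierOutput:
    """Hypothetical stand-in for an output object that wraps the logits."""
    logits: List[float]


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a flat list of numbers."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


output = ClassifierOutput(logits=[2.0, 0.0])

# Bug pattern: softmax(output) fails, because the wrapper is not a
# sequence of numbers.
# Fix: unwrap the logits first.
probs = softmax(output.logits)
print(probs)
```

With PyTorch and a model that returns such an output object, the corresponding change would be to call `F.softmax(outputs.logits, dim=-1)` instead of `F.softmax(outputs, dim=-1)`.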

Class imbalance

Hi,

Thanks for the code. How are you dealing with the class imbalance problem? There will be vastly more gloss-sentence pairs with label 0 than with label 1.
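
For readers with the same question, one common way to counter this kind of imbalance is to weight the loss by inverse class frequency. The sketch below is a generic illustration, not a description of what the GlossBERT code does; the function names and the 9:1 toy label distribution are assumptions made for the example.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by total / (num_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {cls: total / (num_classes * n) for cls, n in counts.items()}


def weighted_cross_entropy(probs_of_one, labels, weights):
    """Weighted binary cross-entropy.

    probs_of_one[i] is the predicted probability that pair i has label 1.
    """
    loss = 0.0
    weight_sum = 0.0
    for p, y in zip(probs_of_one, labels):
        p_true = p if y == 1 else 1.0 - p
        w = weights[y]
        loss += -w * math.log(p_true)
        weight_sum += w
    return loss / weight_sum


# Toy data (assumed): nine negative gloss pairs for every positive one.
labels = [0] * 9 + [1]
weights = inverse_frequency_weights(labels)
print(weights)
```

With these weights, the single positive pair contributes as much to the loss as the nine negative pairs together.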

Last Checkpoint

Could you please write up the steps for using the last checkpoint?

Label Names

Hi! I was wondering in what order yes/no appear in the model output. Is the first logit for yes or for no?

Pretrained Models

Hello,

Are there plans to release pretrained models? The model is very expensive to train, and the research community would really benefit from having them available.

Thanks in advance.

How much memory does the program need?

Hello,
Could you tell me how much memory is needed to train GlossBERT with your example settings?
When I train the model, it always fails with RuntimeError: CUDA out of memory.
My GPU is a 1080 Ti with 10 GB.

error:
Traceback (most recent call last):
  File "run_classifier_WSD_sent.py", line 706, in <module>
    main()
  File "run_classifier_WSD_sent.py", line 520, in main
    logits = model(input_ids=input_ids, token_type_ids=segment_ids, attention_mask=input_mask, labels=None)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 143, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 153, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
    raise output
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
    output = module(*input, **kwargs)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/GlossBERT/modeling.py", line 972, in forward
    _, pooled_output = self.bert(input_ids, token_type_ids, attention_mask, output_all_encoded_layers=False)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/GlossBERT/modeling.py", line 716, in forward
    output_all_encoded_layers=output_all_encoded_layers)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/GlossBERT/modeling.py", line 395, in forward
    hidden_states = layer_module(hidden_states, attention_mask)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/GlossBERT/modeling.py", line 380, in forward
    attention_output = self.attention(hidden_states, attention_mask)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/GlossBERT/modeling.py", line 338, in forward
    self_output = self.self(input_tensor, attention_mask)
  File "/home/nlplab/tjwu/anaconda3/envs/GlossBERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlplab/tjwu/GlossBERT/modeling.py", line 298, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: CUDA out of memory. Tried to allocate 384.00 MiB (GPU 0; 10.92 GiB total capacity; 10.20 GiB already allocated; 23.38 MiB free; 30.57 MiB cached)
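
The failing allocation is the attention score tensor produced by the query-key matmul, which has shape [batch, heads, seq_len, seq_len]. As a rough sketch of how its size scales, the helper below computes the memory for one such tensor. The settings used (a per-GPU batch of 32, 12 attention heads, sequence length 512, 4-byte floats) are assumptions, not values stated in the issue; they are simply one combination that reproduces the 384 MiB figure in the error message, and other combinations give the same number.

```python
def attention_scores_mib(batch_size, num_heads, seq_len, bytes_per_element=4):
    """Size in MiB of one [batch, heads, seq_len, seq_len] score tensor."""
    num_elements = batch_size * num_heads * seq_len * seq_len
    return num_elements * bytes_per_element / (1024 ** 2)


# Assumed settings: per-GPU batch 32, 12 heads, sequence length 512, float32.
print(attention_scores_mib(32, 12, 512))  # 384.0

# The size is linear in batch size and quadratic in sequence length.
print(attention_scores_mib(8, 12, 512))   # 96.0
print(attention_scores_mib(32, 12, 256))  # 96.0
```

One such tensor is created per layer, so lowering the batch size or the maximum sequence length is the usual first thing to try on a card of this size.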

Issues regarding the overall performance in Table 3

We checked the per-POS performance listed in Table 3, but the numbers do not add up. If we sum the correct instances for each POS, as in the following table, the overall performance is 77.6, not 77.0. Please check, thanks.

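                          N         V        A       R        ALL
Instances (incl. SE07)    4300      1652     955     346      7253
SE07                      159       296                       455
Instances (excl. SE07)    4141      1356     955     346      6798
Reported score (%)        79.8      67.1     79.6    87.4     77
Correct (count x score)   3304.518  909.876  760.18  302.404  5276.978
Correct / instances                                           0.776254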
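
The arithmetic in the table can be reproduced with a few lines of Python. This is only a sketch of the check described in the issue: it treats each per-POS score as the fraction of correct instances, which is an approximation given that the reported scores are rounded to one decimal.

```python
# Instance counts per POS after removing SE07, and the reported scores (%).
counts = {"N": 4141, "V": 1356, "A": 955, "R": 346}
scores = {"N": 79.8, "V": 67.1, "A": 79.6, "R": 87.4}

# Estimated number of correct instances per POS.
correct = {pos: counts[pos] * scores[pos] / 100 for pos in counts}

total = sum(counts.values())
overall = 100 * sum(correct.values()) / total

print(total, round(overall, 1))  # 6798 77.6
```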

StopIteration exception?

Hi, have you run into the same problem?
I'm training on eight 16 GB V100 GPUs and get this exception (even when I use only one):
[screenshot]
