Comments (7)
I figured out what the problem is. I was running fine-tuning with a max_seq_length of 512, while the BERTweet model was trained with 130. Once I used a sequence length below 130, it worked. I filed a feature request asking transformers to assert that the sequence length does not exceed max_position_embeddings. See huggingface/transformers#10015
from bertweet.
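The fix above can be sketched as a small guard before training. This helper is hypothetical (not part of transformers); the limit of 130 comes from the thread, and the offset of 2 is an assumption based on how RoBERTa-style models shift position ids by padding_idx + 1:

```python
def clamp_max_seq_length(requested: int, max_position_embeddings: int,
                         pad_offset: int = 2) -> int:
    """Clamp a requested max_seq_length to what the checkpoint supports.

    RoBERTa-style models (BERTweet included) offset position ids by
    padding_idx + 1, so the usable length is typically
    max_position_embeddings - 2.
    """
    usable = max_position_embeddings - pad_offset
    if requested > usable:
        print(f"max_seq_length {requested} exceeds model limit {usable}; clamping")
        return usable
    return requested

# BERTweet-base reports max_position_embeddings = 130, so a requested
# length of 512 is clamped down to 128.
print(clamp_max_seq_length(512, 130))
```

Reading `config.max_position_embeddings` from the loaded model config and clamping like this would have turned the CUDA assert below into an early, readable warning.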
It's not a GPU problem: running on the CPU also crashes, with the following output:
***** Running training *****
Num examples = 15383
Num Epochs = 1
Instantaneous batch size per device = 4
Total train batch size (w. parallel, distributed & accumulation) = 8
Gradient Accumulation steps = 1
Total optimization steps = 1923
0%| | 0/1923 [00:00<?, ?it/s]terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL Error 1: unhandled cuda error
I upgraded to the latest PyTorch (1.7.1); same issue.
Could you please try a newer transformers version?
I have no idea what happened.
You might also try deleting BERTweet from your transformers folder in ~/.cache/torch, so that it is automatically re-downloaded properly.
Sure, I can try that as well. Meanwhile, I ran in interactive mode on a GPU and got more informative errors (I haven't looked into why this happens yet):
Traceback (most recent call last):
File "../models/jigsaw/tr-3.4//run_puppets.py", line 284, in <module>
main()
File "../models/jigsaw/tr-3.4//run_puppets.py", line 195, in main
trainer.train(
File "/dccstor/redrug_ier/envs/attack/lib/python3.8/site-packages/transformers/trainer.py", line 756, in train
tr_loss += self.training_step(model, inputs)
File "/dccstor/redrug_ier/envs/attack/lib/python3.8/site-packages/transformers/trainer.py", line 1056, in training_step
loss = self.compute_loss(model, inputs)
File "/dccstor/redrug_ier/envs/attack/lib/python3.8/site-packages/transformers/trainer.py", line 1080, in compute_loss
outputs = model(**inputs)
File "/dccstor/redrug_ier/envs/attack/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/dccstor/redrug_ier/envs/attack/lib/python3.8/site-packages/transformers/modeling_roberta.py", line 990, in forward
outputs = self.roberta(
File "/dccstor/redrug_ier/envs/attack/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/dccstor/redrug_ier/envs/attack/lib/python3.8/site-packages/transformers/modeling_roberta.py", line 674, in forward
embedding_output = self.embeddings(
File "/dccstor/redrug_ier/envs/attack/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/dccstor/redrug_ier/envs/attack/lib/python3.8/site-packages/transformers/modeling_roberta.py", line 121, in forward
embeddings = inputs_embeds + position_embeddings + token_type_embeddings
RuntimeError: CUDA error: device-side assert triggered
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:658: indexSelectLargeIndex: block: [616,0,0], thread: [96,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
I am not sure the error comes from BERTweet: indexSelectLargeIndex: block: [616,0,0], thread: [96,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
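That assert is the GPU symptom of an out-of-range embedding lookup: position ids generated for sequences longer than the model supports index past the end of the position-embedding table. A minimal reproduction sketch, with sizes taken from BERTweet's reported limit of 130:

```python
import torch

# Position-embedding table with 130 slots, as in BERTweet-base.
position_embeddings = torch.nn.Embedding(num_embeddings=130, embedding_dim=8)

# In-range position ids work fine.
ok = position_embeddings(torch.tensor([0, 64, 129]))
print(ok.shape)

# A position id produced by max_seq_length=512 is out of range; on CPU
# this raises IndexError, while on GPU it surfaces as the device-side
# assert `srcIndex < srcSelectDimSize` seen in the traceback above.
try:
    position_embeddings(torch.tensor([512]))
except IndexError as err:
    print("IndexError:", err)
```

Running the failing step on CPU (or with CUDA_LAUNCH_BLOCKING=1) is one way to trade the opaque device-side assert for a readable IndexError.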