Comments (18)
FYI, the minimal example ran successfully before I tried it with my dataset.
Is that all of your data? The labels need to be 0, ..., n-1, where n is the number of labels.
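For reference, remapping arbitrary labels to that contiguous range is easy with pandas. A minimal sketch (not from the thread), where label_index records which original label each new id corresponds to:

import pandas as pd

# pd.factorize assigns each distinct label an integer starting at 0
df_train['label'], label_index = pd.factorize(df_train['label'], sort=True)
print(df_train['label'].value_counts())  # labels are now 0 .. n-1
print(list(label_index))                 # original label for each new id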
Sorry for not being clear! No, the dataset is about 360k rows; I just included the first 4 rows as a sample. Pasting the label value_counts again here to give a feel for the label distribution:
df_train.label.value_counts()
3 212925
2 71273
0 9883
1 5920
Name: label, dtype: int64
My bad, I didn't notice you'd included the value counts in the original comment.
I can't spot any obvious issues here that would cause this error. Are you using the latest version of Simple Transformers and PyTorch?
Here are the various package versions:
PyTorch = 1.3.0
SimpleTransformers = 0.4.5
TensorFlow = 1.14.0
Thanks again for looking into it!
Those seem fine. What version of Transformers are you using?
This error is usually caused by having more labels than num_labels and/or having a label greater than or equal to num_labels. But I can't see either of those cases in your data. Does it work if you use the AG News dataset as in the Medium article?
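As an aside, a quick way to catch that condition before training (a minimal sketch, not from the thread, assuming num_labels=4 as in this setup):

num_labels = 4
# flag any label outside the valid range 0 .. num_labels - 1
out_of_range = ~df_train['label'].between(0, num_labels - 1)
assert not out_of_range.any(), df_train.loc[out_of_range, 'label'].unique()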
My Hugging Face Transformers version is 2.1.1. Let me try the AG News dataset and get back to you shortly.
Same error with the AG News dataset as well. Pasting the code and output below:
import pandas as pd
train_df = pd.read_csv('train.csv', header=None)
train_df['text'] = train_df.iloc[:, 1] + " " + train_df.iloc[:, 2]
train_df = train_df.drop(train_df.columns[[1, 2]], axis=1)
train_df.columns = ['label', 'text']
train_df = train_df[['text', 'label']]
train_df['text'] = train_df['text'].apply(lambda x: x.replace('\\', ' '))
eval_df = pd.read_csv('test.csv', header=None)
eval_df['text'] = eval_df.iloc[:, 1] + " " + eval_df.iloc[:, 2]
eval_df = eval_df.drop(eval_df.columns[[1, 2]], axis=1)
eval_df.columns = ['label', 'text']
eval_df = eval_df[['text', 'label']]
eval_df['text'] = eval_df['text'].apply(lambda x: x.replace('\\', ' '))
eval_df['label'] = eval_df['label'].apply(lambda x: x - 1)
train_df.head()
text label
0 Wall St. Bears Claw Back Into the Black (Reute... 3
1 Carlyle Looks Toward Commercial Aerospace (Reu... 3
2 Oil and Economy Cloud Stocks' Outlook (Reuters... 3
3 Iraq Halts Oil Exports from Main Southern Pipe... 3
4 Oil prices soar to all-time record, posing new... 3
train_df.label.value_counts()
4 30000
3 30000
2 30000
1 30000
Name: label, dtype: int64
train_df.dtypes
text object
label int64
dtype: object

from simpletransformers.model import TransformerModel
/home/jbabu/.local/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
(the same FutureWarning repeats for the quint8, qint16, quint16, qint32, and resource dtypes)
model = TransformerModel('roberta', 'roberta-base', num_labels=4)
model = TransformerModel('roberta', 'roberta-base', num_labels=4, args={'learning_rate':1e-5, 'num_train_epochs': 2, 'reprocess_input_data': True, 'overwrite_output_dir': True})
model.train_model(train_df)
Converting to features started.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 120000/120000 [00:15<00:00, 7962.47it/s]
Selected optimization level O1: Insert automatic casts around Pytorch functions and Tensor methods.
Defaults for this optimization level are:
enabled : True
opt_level : O1
cast_model_type : None
patch_torch_functions : True
keep_batchnorm_fp32 : None
master_weights : None
loss_scale : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled : True
opt_level : O1
cast_model_type : None
patch_torch_functions : True
keep_batchnorm_fp32 : None
master_weights : None
loss_scale : dynamic
Epoch: 0%| | 0/2 [00:00<?, ?it/s]
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:106: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:106: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [1,0,0] Assertion `t >= 0 && t < n_classes` failed.
THCudaCheck FAIL file=/pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu line=110 error=710 : device-side assert triggered
Current iteration: 0%| | 0/15000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "", line 1, in
File "/home/jbabu/.local/lib/python3.7/site-packages/simpletransformers/model.py", line 142, in train_model
global_step, tr_loss = self.train(train_dataset, output_dir, show_running_loss=show_running_loss)
File "/home/jbabu/.local/lib/python3.7/site-packages/simpletransformers/model.py", line 367, in train
outputs = model(**inputs)
File "/home/jbabu/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/jbabu/.local/lib/python3.7/site-packages/transformers/modeling_roberta.py", line 340, in forward
loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
File "/home/jbabu/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/jbabu/.local/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 916, in forward
ignore_index=self.ignore_index, reduction=self.reduction)
File "/usr/local/lib/python3.7/dist-packages/apex/amp/wrap.py", line 28, in wrapper
return orig_fn(*new_args, **kwargs)
File "/home/jbabu/.local/lib/python3.7/site-packages/torch/nn/functional.py", line 2009, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
File "/usr/local/lib/python3.7/dist-packages/apex/amp/wrap.py", line 28, in wrapper
return orig_fn(*new_args, **kwargs)
File "/home/jbabu/.local/lib/python3.7/site-packages/torch/nn/functional.py", line 1838, in nll_loss
ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: cuda runtime error (710) : device-side assert triggered at /pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:110
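As an aside, device-side asserts like this mask the real index error; a standard PyTorch debugging step (not mentioned in the thread) is to force synchronous kernel launches, or to run on the CPU, so the offending label value is reported directly. A sketch, where use_cuda=False is an assumption about this version of simpletransformers:

import os
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'  # must be set before CUDA is initialised

from simpletransformers.model import TransformerModel
# use_cuda=False is assumed; if this version lacks the flag, running on a
# CPU-only machine surfaces the same readable out-of-range error
model = TransformerModel('roberta', 'roberta-base', num_labels=4, use_cuda=False)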
train_df['label'] = train_df['label'].apply(lambda x: x - 1)

I think this line was missing from the Medium article. Can you try again with it included? The train_df value counts should then have the labels 0, 1, 2, 3.
Hi @ThilinaRajapakse,
Making the labels 0, 1, 2, 3 got the Medium article working!
Also, when I passed a data frame with only the 'label' and 'text' columns, my own dataset worked as well!
Thanks a lot for your swift responses!
By the way, do you have any tips or documentation on how to tune the parameters efficiently?
Thanks,
Jiby
Great to hear that you got it to work!
Regarding "when I passed a data frame with only the 'label' and 'text' columns, my dataset worked as well": I'll look into this. It should still work even if you have more columns.
Unfortunately, hyperparameter tuning is still largely trial and error, but I can give a couple of pointers that may be useful. For Transformers, 2-4 training epochs are usually sufficient. In my experience, good learning rates are usually in the 1e-4 to 5e-5 range. These are rough estimates, but they work as a starting point.
Of course, there's no guarantee that these tips will be effective in all cases (or even in most cases)!
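For example, a small sweep over those ranges could look like this. A sketch built from the same TransformerModel calls used above; it assumes eval_df exists and that eval_model returns (result, model_outputs, wrong_predictions) with 'eval_loss' in the result dict:

from simpletransformers.model import TransformerModel

best = None
for lr in [1e-4, 5e-5, 2e-5]:    # learning rates in the suggested range
    for epochs in [2, 3, 4]:     # 2-4 epochs are usually sufficient
        model = TransformerModel('roberta', 'roberta-base', num_labels=4,
                                 args={'learning_rate': lr,
                                       'num_train_epochs': epochs,
                                       'reprocess_input_data': True,
                                       'overwrite_output_dir': True})
        model.train_model(train_df)
        result, model_outputs, wrong_predictions = model.eval_model(eval_df)
        # keep the run with the lowest evaluation loss
        if best is None or result['eval_loss'] < best[0]:
            best = (result['eval_loss'], lr, epochs)

print(best)  # (loss, learning_rate, num_train_epochs) of the best run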
Another issue came up when I tried to load the trained model and predict. I was able to load the model, but prediction/evaluation failed. Please find the errors below:
model = TransformerModel('bert', 'outputs/')
model.eval_model(df_test, f1=f1_multiclass, acc=accuracy_score)
Traceback (most recent call last):
File "", line 1, in
File "/home/jbabu/.local/lib/python3.7/site-packages/simpletransformers/model.py", line 176, in eval_model
self._move_model_to_device()
File "/home/jbabu/.local/lib/python3.7/site-packages/simpletransformers/model.py", line 513, in _move_model_to_device
self.model.to(self.device)
File "/home/jbabu/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 426, in to
return self._apply(convert)
File "/home/jbabu/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
module._apply(fn)
File "/home/jbabu/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
module._apply(fn)
File "/home/jbabu/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
module._apply(fn)
File "/home/jbabu/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 224, in _apply
param_applied = fn(param)
File "/home/jbabu/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 424, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: device-side assert triggered
Did you follow the same procedure for the evaluation dataset as for the training dataset? The labels in the evaluation dataset need to be the same as the labels in the training dataset.
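For reference, one quick way to verify that (a sketch, not from the thread, using the df_train and df_test names from the earlier comments):

train_labels = set(df_train['label'])
eval_labels = set(df_test['label'])
# every evaluation label must also appear in the training labels
assert eval_labels <= train_labels, eval_labels - train_labels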
Yup, it's the same! FYI, model.predict("sample text") is working fine; model.eval_model is the one throwing this error.
What happens if you try running eval on the training dataset itself?
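Concretely, reusing the metric helpers from the earlier comment:

# same call as before, but on the training data; if this also raises the
# CUDA assert, the saved model's num_labels likely doesn't match the data
model.eval_model(df_train, f1=f1_multiclass, acc=accuracy_score)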
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi @ThilinaRajapakse,
After training the model, I want to evaluate it with f1_score, so I called eval_model(train_df, f1=f1_score) and got this error:
ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
Fair enough. Next, I added the average inside f1_score, as in eval_model(train_df, f1=f1_score(average='micro')), and got the next error:
f1_score() missing 2 required positional arguments: 'y_true' and 'y_pred'
So how and when should I provide average='micro'?
You need to wrap the f1_score function from sklearn in your own function with the correct arguments:
from sklearn.metrics import f1_score

def f1_score_micro(y_true, y_pred):
    return f1_score(y_true, y_pred, average="micro")
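Then pass the wrapper itself, not a call to it, when evaluating:

model.eval_model(train_df, f1=f1_score_micro)  # eval_model calls it as f1(y_true, y_pred)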