I had a run through with this model classifying comments on a petition that had a lot

A few observations. about covid-twitter-bert HOT 2 OPEN

digitalepidemiologylab commented on September 26, 2024

A few observations.

from covid-twitter-bert.

Comments (2)

mar-muel commented on September 26, 2024

responding to the tone of 'voice' in the comments, producing strong "false" or "misleading" signals if the input text is aggressive in nature?

Sometimes a thorough error analysis is very insightful! Problematic errors are systematic errors and it's important to reveal them/know how they impact the summary statistics.

In general, your definition of "fake" might also partially overlap with "non-rational"/agitated/ALL CAPS comments, so it's worth conducting a similar analysis on your annotation set. Larger models usually require fewer samples to reach a decent accuracy level, so you might be able to clean your annotation data a bit as well (as long as you're not introducing another bias). This usually has a positive impact on scores because your objective is clearer.
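One rough way to check the annotation set for this kind of overlap is to compare simple surface features per label. The sketch below is only an illustration: the feature choice (`caps_ratio`, exclamation count), the label names and the toy comments are all assumptions, not part of the actual dataset.

```python
from collections import defaultdict
from statistics import mean


def caps_ratio(text):
    """Share of alphabetic characters that are upper case (0.0 if no letters)."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)


def surface_features(text):
    """Hypothetical 'tone' proxies; swap in whatever fits your data."""
    return {
        "caps_ratio": caps_ratio(text),
        "exclamations": float(text.count("!")),
    }


def feature_means_by_label(samples):
    """samples: iterable of (text, label). Returns {label: {feature: mean}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for text, label in samples:
        for name, value in surface_features(text).items():
            grouped[label][name].append(value)
    return {
        label: {name: mean(values) for name, values in feats.items()}
        for label, feats in grouped.items()
    }


# Toy annotations, invented for illustration only.
annotations = [
    ("THIS IS ALL A LIE!!!", "misleading"),
    ("WAKE UP people!", "misleading"),
    ("The trial enrolled 30,000 participants.", "ok"),
    ("I disagree with the policy, but the data looks sound.", "ok"),
]

summary = feature_means_by_label(annotations)
for label, feats in summary.items():
    print(label, feats)
```

If the per-label means differ a lot, the labels are at least partly predictable from tone alone, which is a hint that the annotation guidelines (or the data) may need another look.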

Just some thoughts - good luck with the analysis.


peregilk commented on September 26, 2024

Following up on Martin's comments here.

Firstly, COVID-TWITTER-BERT is starting to get a bit old. It was trained at the beginning of the pandemic. It still "thinks" that Malone is a basketball player and that alpha, delta and omicron are just letters in the Greek alphabet. In some cases, the stance/sentiment of a sentence requires you to know the meaning of these words. To fix this, one would have to do some additional pretraining on additional (unannotated) data. Not sure if it would have a real impact in your case, just something you should think about.

Another point is the possibility that the model is picking up the "tone of voice", as you describe it, during finetuning. Take a minute to think about the process of finetuning on a classification task. Let's say you have the task of pro/anti vaccine. You do some annotation, and put the "pro" in pile A and the "anti" in pile B. In real life, a lot of these categorisations are really hard. Inter-rater reliability on tasks like this is typically below 0.8. Then you finetune your model on this. However, you are no longer finetuning on pro vs anti vaccine. You are finetuning on recreating pile A and pile B. There are a lot of other ways of recreating these piles, for instance from the use of specific words, the level of anger, or the use of CAPS LOCK.
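To make the "recreating the piles" point concrete, here is a minimal sketch of a baseline that never looks at meaning at all: it only thresholds the share of upper-case letters. The piles, the texts and the helper name `best_threshold_accuracy` are made up for illustration, and the threshold is fitted and scored on the same toy data, so the number is optimistic by construction.

```python
def caps_ratio(text):
    """Share of alphabetic characters that are upper case (0.0 if no letters)."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)


def best_threshold_accuracy(samples, positive_label):
    """Try every observed caps ratio as a threshold.

    A text is predicted as `positive_label` when its caps ratio is >= threshold.
    Returns (threshold, accuracy) for the best threshold on these samples.
    """
    scored = [(caps_ratio(text), label == positive_label) for text, label in samples]
    best = (0.0, 0.0)
    for threshold in sorted({score for score, _ in scored}):
        correct = sum((score >= threshold) == is_pos for score, is_pos in scored)
        accuracy = correct / len(scored)
        if accuracy > best[1]:
            best = (threshold, accuracy)
    return best


# Invented piles: "anti" happens to be shouted, "pro" happens to be calm.
piles = [
    ("VACCINES ARE POISON", "anti"),
    ("They are LYING to you", "anti"),
    ("Got my second dose today.", "pro"),
    ("Vaccines saved my grandmother.", "pro"),
]

threshold, accuracy = best_threshold_accuracy(piles, positive_label="anti")
print(f"caps-only baseline: threshold={threshold:.2f}, accuracy={accuracy:.2f}")
```

If a baseline like this scores well on your real annotations (ideally on a held-out split), a finetuned model has an easy shortcut available, and it may well take it.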

There are ways of getting around this problem. One approach is to make the classification target-specific (where you give the classifier the label of the piles as a hint about what you are looking for). Another approach is not to train on the classification task at all, but instead to view this as a logical (natural language inference) task. We have made an MNLI version of the model that can be used for that.
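A rough sketch of what the "logical task" framing could look like: each candidate label is turned into a hypothesis, and an NLI model is asked whether the comment entails it. This assumes the Hugging Face `transformers` zero-shot pipeline is installed; the hypothesis template and label wording are illustrative choices, and `model_name` is deliberately left as a parameter, so look up the exact identifier of the MNLI checkpoint yourself.

```python
HYPOTHESIS_TEMPLATE = "This comment is {}."  # illustrative wording, tune for your task


def build_nli_pairs(text, labels, template=HYPOTHESIS_TEMPLATE):
    """Return (premise, hypothesis) pairs, one per candidate label.

    This is the input an NLI model scores under the hood; no model is needed here.
    """
    return [(text, template.format(label)) for label in labels]


def classify_zero_shot(text, labels, model_name, template=HYPOTHESIS_TEMPLATE):
    """Score `labels` for `text` with an NLI checkpoint, without task finetuning.

    `model_name` must point to an MNLI-finetuned model. Returns the pipeline's
    dict with "sequence", "labels" and "scores" keys.
    """
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model=model_name)
    return classifier(text, candidate_labels=labels, hypothesis_template=template)


pairs = build_nli_pairs(
    "THEY ARE HIDING THE DATA FROM US",
    ["in favour of vaccination", "against vaccination"],
)
for premise, hypothesis in pairs:
    print(premise, "=>", hypothesis)
```

Calling `classify_zero_shot(comment, labels, model_name=...)` then gives one score per label. Since the labels are spelled out in the hypothesis, the model is told what you are looking for instead of having to guess it from the piles.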

Best of luck with the competition!

