NLP models that detect the stance toward vaccination and the content type of Arabic tweets related to COVID-19 vaccination.
The project is divided into three main sections:
1- Preprocessing: three main steps: regex replacements, stemming, and tokenization.
2- Feature Extraction: we used word embeddings, TF-IDF, and Bag of Words, alongside hand-picked features.
3- Classification: we trained several models: a BERT model (aubmindlab/bert-base-arabertv02-twitter), RNN, LSTM, and classical ML classifiers (Multinomial Naive Bayes, Logistic Regression, Linear SVM, SVM, Random Forest, Decision Tree).
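The first two stages above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual code: the regex patterns, the function names `preprocess` and `tfidf`, the sample tweets, and the smoothed IDF formula are all assumptions, and stemming is left out because it needs an Arabic stemmer from an external library.

```python
import math
import re
from collections import Counter

# Hypothetical cleanup rules; the project's real patterns are not documented here.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
DIACRITICS_RE = re.compile(r"[\u064B-\u0652\u0640]")  # harakat and tatweel
NON_ARABIC_RE = re.compile(r"[^\u0621-\u064A\s]")     # anything but Arabic letters


def preprocess(tweet: str) -> list[str]:
    """Apply regex replacements, then tokenize on whitespace."""
    text = URL_RE.sub(" ", tweet)
    text = MENTION_RE.sub(" ", text)
    text = DIACRITICS_RE.sub("", text)
    text = re.sub("[\u0622\u0623\u0625]", "\u0627", text)  # unify alef variants
    text = NON_ARABIC_RE.sub(" ", text)
    return text.split()


def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Weight each term by tf * idf, with idf = ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


# Made-up sample tweets, for illustration only.
tweets = [
    "اللقاح آمن وفعّال https://t.co/abc",
    "@user لن آخذ اللقاح أبداً",
]
tokens = [preprocess(t) for t in tweets]
vectors = tfidf(tokens)
```

Each entry of `vectors` maps a token to its weight for one tweet; terms that occur in every tweet receive a lower weight than terms unique to one. Vectors like these could feed the classical classifiers listed in step 3, while the BERT model would normally consume text through its own tokenizer.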
Marim Naser
Mariem Muhammed
Ammar Mohamed
Omar Ahmed