RandomForest and Naive Bayes classifiers to determine whether a Twitter user is a brand account (e.g. Domino's Pizza) or not.
RandomForest Accuracy: 71%
Naive Bayes Accuracy: 73%
Dataset: https://www.kaggle.com/crowdflower/twitter-user-gender-classification
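As a rough illustration of the Naive Bayes side of this task, here is a minimal from-scratch sketch of a multinomial Naive Bayes text classifier with add-one smoothing. It is not the project's actual code: the function names `train_nb` and `predict_nb`, the whitespace tokenization, and the four toy profile descriptions are all assumptions made up for the example, and the real model was presumably trained on the Kaggle dataset above with its own features.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Count documents per class and words per class (hypothetical helper)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        # Log prior of the class.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothed word likelihood.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Toy, made-up profile descriptions; NOT taken from the dataset.
docs = [
    "official account order online deals today",
    "shop our new menu order now",
    "love my dog and coffee",
    "student dreamer coffee lover",
]
labels = ["brand", "brand", "human", "human"]

model = train_nb(docs, labels)
prediction = predict_nb(model, "order our deals online")
print(prediction)
```

On this toy data the query shares several words with the "brand" examples, so the brand class gets the higher score. The accuracies reported above come from the real dataset, not from this sketch.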
K-Means and DBSCAN clustering on Word2Vec representations of a corpus derived from tweets.
Interesting results: the clusters appear to capture some semantic meaning. For example, 'Brother' and 'Sister' fall in the same cluster, as do 'College' and 'School'.
Dataset: https://www.kaggle.com/kazanova/sentiment140
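To show how words with similar vectors end up in the same cluster, here is a small sketch of K-Means (Lloyd's algorithm) in plain Python. The 2-D vectors are invented stand-ins, not real Word2Vec output (real embeddings typically have a hundred or more dimensions), and the `kmeans` helper and the choice of initial centroids are assumptions for the example rather than the project's implementation.

```python
import math


def kmeans(points, initial_centroids, iterations=10):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    centroids = list(initial_centroids)  # copy so the caller's list is untouched
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignments, centroids


# Hypothetical 2-D "embeddings" chosen by hand for illustration only.
vectors = {
    "brother": (0.9, 0.1),
    "sister": (0.8, 0.2),
    "college": (0.1, 0.9),
    "school": (0.2, 0.8),
}
words = list(vectors)
points = [vectors[w] for w in words]

# Seed with one point from each apparent group to keep the run deterministic.
assignments, centroids = kmeans(points, [points[0], points[2]])
clusters = dict(zip(words, assignments))
print(clusters)
```

With these hand-picked vectors, 'brother' and 'sister' share one cluster and 'college' and 'school' share the other, which mirrors the kind of grouping described above. DBSCAN differs in that it groups points by density and does not need the number of clusters up front.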
A-Priori and FPGrowth to mine frequent itemsets and association rules among Twitter celebrities.
Interesting results: People who follow @joshuadun are likely to also follow @troyesivan, @tylerrjoseph, and @twentyonepilots, with 92% confidence.
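For readers unfamiliar with the term, the confidence of a rule A -> B is the fraction of users following everything in A who also follow everything in B. The sketch below computes it directly from follow sets. The five follow lists are made up for illustration (so the result is not the 92% reported above), and `count_containing` and `confidence` are hypothetical helper names, not functions from any library.

```python
def count_containing(transactions, itemset):
    """Number of transactions (follow sets) that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(transactions, antecedent, consequent):
    """confidence(A -> B) = count(A and B together) / count(A)."""
    both = count_containing(transactions, antecedent | consequent)
    return both / count_containing(transactions, antecedent)


# Toy follow lists, one set per user; invented data, not mined results.
follows = [
    {"@joshuadun", "@tylerrjoseph", "@twentyonepilots"},
    {"@joshuadun", "@tylerrjoseph"},
    {"@joshuadun", "@tylerrjoseph", "@troyesivan"},
    {"@joshuadun"},
    {"@troyesivan"},
]

rule_confidence = confidence(follows, {"@joshuadun"}, {"@tylerrjoseph"})
print(rule_confidence)  # 0.75: 3 of the 4 @joshuadun followers also follow @tylerrjoseph
```

A-Priori and FPGrowth exist to find the frequent itemsets efficiently on large data. Once those are known, rule confidence is computed from their counts in the same way as here.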