These are some of the Python programs I wrote to filter and transform data for my master's thesis, titled "Applying Natural Language Processing techniques to analyze HIV-related discussion on Social Media". The idea of the thesis was to collect, filter, transform and analyze discussions carrying the hashtag HIV, apply Natural Language Processing techniques such as sentiment analysis and content analysis to them, and study the results. The same approach can then be used to track the spread of any hashtag, for example:
- in public health, to follow discussion of viruses and diseases
- in social studies, to analyze the trend of a given hashtag

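
As a rough illustration of the filtering step described above, the sketch below keeps only the posts that contain a given hashtag. It is not the code used in the thesis: the function names `build_hashtag_pattern` and `filter_posts` are hypothetical, and it assumes posts are already available as plain strings.

```python
import re
from typing import Iterable, List


def build_hashtag_pattern(tag: str) -> re.Pattern:
    """Compile a case-insensitive pattern for '#<tag>' as a whole hashtag.

    The lookarounds stop '#HIV' from matching inside a longer hashtag
    such as '#HIVtest' or in the middle of a word such as 'abc#HIV'.
    """
    return re.compile(rf"(?<!\w)#{re.escape(tag)}(?!\w)", re.IGNORECASE)


def filter_posts(posts: Iterable[str], tag: str = "HIV") -> List[str]:
    """Return the posts that contain the hashtag, preserving input order."""
    pattern = build_hashtag_pattern(tag)
    return [post for post in posts if pattern.search(post)]
```

With made-up input such as `["Get tested #HIV", "#HIVtest is different", "nothing here"]`, `filter_posts` returns only the first post. Matching the hashtag as a whole token is a design choice of this sketch; a real pipeline might instead rely on hashtag metadata supplied by the platform.
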
Some of the algorithms are not included because they contain sensitive information that could not be shared for privacy reasons.
This study was conducted thanks to an Erasmus program at Aalto University, in collaboration with the Polytechnic University of Turin.
A journal publication based on this study will be prepared in collaboration with professors from both Aalto University, Finland, and Ohio University, USA.