In this repository I scanned all geotagged tweets sent in 2020 to monitor the spread of the coronavirus on Twitter. About 2% of all tweets are geotagged every day. The dataset I worked with contains about 1.1 billion tweets. Along the way I learned how to work with large-scale datasets effectively, and my analysis covered multilingual text such as English and Korean. I used the shell, Vim, and of course Python to generate four PNG graphs. This project demonstrates how to filter, reduce, and visualize a large dataset as easy-to-read visuals like the graphs below!
cameronshir11 / twitter_coronavirus
This project is forked from mikeizbicki/twitter_coronavirus
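The filter and reduce steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function names (`map_tweets`, `reduce_counts`, `top_k`), the hashtag list, and the assumption that each tweet is one JSON object per line with `text` and `lang` fields are all hypothetical choices made for the example.

```python
import json
from collections import Counter, defaultdict

# Hypothetical hashtag list; the hashtags actually tracked are not stated here.
HASHTAGS = ["#coronavirus", "#코로나바이러스"]


def map_tweets(lines, hashtags=HASHTAGS):
    """Filter step: for each hashtag, count matching tweets per language.

    Assumes each line is one JSON-encoded tweet with "text" and "lang" keys.
    """
    counts = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        text = tweet.get("text", "").lower()
        lang = tweet.get("lang", "und")  # "und" used here for a missing language
        for tag in hashtags:
            if tag in text:
                counts[tag][lang] += 1
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-file counts into a single total."""
    total = defaultdict(Counter)
    for partial in partials:
        for tag, by_lang in partial.items():
            total[tag].update(by_lang)
    return total


def top_k(counter, k=10):
    """Pick the k largest entries, i.e. the bars a plot would show."""
    return counter.most_common(k)


if __name__ == "__main__":
    # Tiny made-up sample standing in for two daily files of tweets.
    day1 = [
        '{"text": "Stay safe #coronavirus", "lang": "en"}',
        '{"text": "#코로나바이러스 조심하세요", "lang": "ko"}',
    ]
    day2 = ['{"text": "More #Coronavirus news", "lang": "en"}']
    total = reduce_counts([map_tweets(day1), map_tweets(day2)])
    print(top_k(total["#coronavirus"]))
```

Because each file is mapped independently, the map step can run on many files in parallel from the shell, and only the small per-file counts need to be merged and plotted afterwards.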