- This repository contains data in CSV format, scraped from reliable sources (e.g. the World Health Organisation).
- Data are scraped a few times daily and pushed back to this repository together with generated charts (.png files).
- Data scraping is automated with GitHub Actions.
- Look for the direct CSV links below to get the scraped historical data.
- A related repository for news scraping is available at https://github.com/alext234/coronavirus-news/blob/master/README.md
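The scraped CSVs can be loaded directly with pandas. The sketch below parses a small inline sample instead of fetching a file; the column names and values are illustrative assumptions, not the repository's actual schema:

```python
import io

import pandas as pd

# Hypothetical sample of scraped CSV content; the real files are the
# CSVs linked below (column names here are assumptions for illustration).
sample_csv = (
    "date,country,cases\n"
    "2020-02-01,Singapore,18\n"
    "2020-02-02,Singapore,24\n"
)

# pandas accepts any file-like object, so a raw-file URL works the same way.
df = pd.read_csv(io.StringIO(sample_csv), parse_dates=["date"])
latest = df.sort_values("date").iloc[-1]
```
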
Below are international stats, excluding China.
Bar chart of the latest snapshot.
Data are scraped from these reports, which are in PDF format; new reports are released daily.
This page has the real-time stats from China; data are pulled several times a day by the pipeline.
Data are pulled from the Department of Health website.
Data are scraped from the MOH (Ministry of Health) local situation web page.
Cases in the US (data are scraped from here)
The chart for the US is not plotted due to a change in the way stats are collected.
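The generated .png charts follow a simple pattern; below is a minimal matplotlib sketch of a bar chart of a latest snapshot, with illustrative data and a hypothetical output file name (not the repository's actual plotting code):

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, as used in a CI pipeline
import matplotlib.pyplot as plt

# Illustrative snapshot; the real values come from the scraped CSVs.
countries = ["Singapore", "Japan", "Thailand"]
cases = [24, 20, 19]

fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(countries, cases)
ax.set_ylabel("Confirmed cases")
ax.set_title("Latest snapshot")
fig.tight_layout()
fig.savefig("latest_snapshot.png")  # hypothetical file name
```
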
- Jupyter notebooks are used to scrape the data and output CSV files.
- The notebooks are executed on a schedule by a GitHub Actions pipeline to scrape new data.
- The pipeline also commits the new data back to this repository.
- Tools: Python 3, Jupyter, pandas, BeautifulSoup, and related tooling (e.g. Selenium for web scraping). It is recommended to start the development environment with this Docker image, which is also used by the GitHub Actions build pipeline:

```
docker run -p 8888:8888 -it -v $PWD:/stats -w /stats alext234/datascience:latest bash
```
- requirements.txt contains the Python dependencies:

```
pip install -r requirements.txt
```
- Start the Jupyter notebook server from inside the container, then visit http://localhost:8888 in the browser:

```
jupyter notebook --allow-root --ip=0.0.0.0
```
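A minimal sketch of the scrape-to-CSV pattern the notebooks follow, using BeautifulSoup and pandas on an inline HTML snippet. The table layout, `id`, and output file name are assumptions for illustration; the real notebooks fetch live pages (e.g. with requests or Selenium):

```python
import pandas as pd
from bs4 import BeautifulSoup

# Inline HTML standing in for a fetched situation-report page
# (hypothetical structure; real pages differ per source).
html = """
<table id="stats">
  <tr><th>Country</th><th>Cases</th></tr>
  <tr><td>Singapore</td><td>24</td></tr>
  <tr><td>Japan</td><td>20</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = soup.find("table", id="stats").find_all("tr")
header = [th.get_text(strip=True) for th in rows[0].find_all("th")]
data = [[td.get_text(strip=True) for td in tr.find_all("td")] for tr in rows[1:]]

df = pd.DataFrame(data, columns=header)
df["Cases"] = df["Cases"].astype(int)
df.to_csv("snapshot.csv", index=False)  # hypothetical output file name
```
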
- Feel free to create new issues for any potential data source worth scraping.
- Pull requests are welcome!