
lacounty_covid19_data's Introduction

LA County COVID-19 Data Set and Tools for Data Scientists

This repository contains code (in the form of Python scripts) to obtain and visualize data about confirmed positive cases of COVID-19 in the cities and communities within LA County. It also includes sample data obtained from these scripts as well as sample plots.

We also post the latest plots every day on the following website: CoVID-19 Plots for LA County.

Data Source

The Los Angeles County Department of Public Health issues a press release every day, which reports the number of COVID-19 cases in Los Angeles County and its neighborhoods. We provide pointers to some of the press releases that were used for scraping the data below:

Scripts

The scripts folder contains the Python files needed to produce the data and plots in this repository. The script fetch_and_store.py scrapes data from the press releases listed above and stores it in a JSON file for further processing, visualization, and analytics. We also provide several scripts, prefixed with 'plot_' in the file name, that plot heat maps and graph the risk estimation value for communities over time. These scripts were created to process the press releases from March 16 to March 27.

Requirements

The requirements.txt file lists the modules needed to run these scripts. Install them by running either of the following in the terminal:

  • pip install -r requirements.txt
  • conda install --file requirements.txt

Data

If you are interested in the data itself, see the data folder. It provides a CSV file of daily COVID-19 cases by community (Covid-19.csv). The same information is available as JSON files (lacounty_covid.json and lacounty_total_case_count.json), where each key is the day in March and each value holds the case counts for the communities in LA County.
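As a rough sketch of how the day-keyed JSON might be consumed, the snippet below turns such a mapping into a time series per community. The nested layout (day, then community, then count), the community names, and the counts are illustrative assumptions, not the actual contents of lacounty_covid.json. The helper name series_by_community is hypothetical as well.

```python
import json
from collections import defaultdict

# Illustrative stand-in for the file contents. ASSUMPTION: keys are the day of
# March (as strings) and each value maps a community name to its case count.
# The real files may be laid out differently; these numbers are made up.
SAMPLE_JSON = """
{
  "16": {"Community A": 2, "Community B": 1},
  "17": {"Community A": 5, "Community B": 1, "Community C": 3}
}
"""

def series_by_community(day_to_cases):
    """Return {community: [(day, cases), ...]} with entries ordered by day."""
    series = defaultdict(list)
    # Sort keys numerically so that "9" comes before "16".
    for day in sorted(day_to_cases, key=int):
        for community, cases in day_to_cases[day].items():
            series[community].append((int(day), int(cases)))
    return dict(series)

data = json.loads(SAMPLE_JSON)
for community, points in series_by_community(data).items():
    print(community, points)
```

To try this against the repository's files, replace SAMPLE_JSON with the result of json.load on an open file handle and adjust the loop if the layout differs from the one assumed here.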

Plots

We have generated plots using the data retrieved from LA County press releases. These plots show daily time-series data for confirmed COVID-19 positive cases and fatalities in the communities and cities within LA County that have the highest case counts.

Questions

For any questions about this data set or tools, please contact Dr. Gowri Sankar Ramachandran ([email protected]) or Prof. Bhaskar Krishnamachari ([email protected]).

lacounty_covid19_data's People

Contributors

gshanr, mehrdadkiam, caravansary83, bkrishnamachari, francisco-avalos


lacounty_covid19_data's Issues

plots

On the plots:

  1. It would be good to have the x-axis start from 1 and be labeled "Days since March 16, 2020".
  2. It would also be good to have plots that show the y-axis in log scale (we will need it soon enough :( ).
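The two requests above could be sketched with matplotlib as follows. This is only an illustration, not the repository's plotting code: the function name plot_cases, the y-axis label, and the sample counts are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is required
import matplotlib.pyplot as plt

# Made-up counts for illustration only; not real LA County data.
daily_cases = [3, 5, 9, 14, 22, 40]
# Day 1 corresponds to March 16, 2020, so the x-axis starts at 1.
days = list(range(1, len(daily_cases) + 1))

def plot_cases(days, cases, log_scale=False):
    """Plot cases against the day counter, optionally with a log y-axis."""
    fig, ax = plt.subplots()
    ax.plot(days, cases, marker="o")
    ax.set_xlabel("Days since March 16, 2020")
    ax.set_ylabel("Confirmed cases")
    ax.set_xlim(left=1)
    if log_scale:
        ax.set_yscale("log")
    return fig, ax

fig, ax = plot_cases(days, daily_cases, log_scale=True)
```

Calling fig.savefig with a file name writes the figure to disk. Note that zero counts cannot be drawn on a log axis, so communities with no cases on a given day would need separate handling.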

Dates

Dates should be in yymmdd, ddmmyy, or a similar format; otherwise it will get confusing. Using just 16 for March 16 will cause confusion when April 16 rolls around, and so on. We should therefore be careful not to use the date as the counter.

One approach is to have a days counter that has its start date as March 16, and then from that point on increments by one for each new date.

We can create a function that maps from the counter to a mmddyy type date string where the latter is needed.
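A minimal sketch of such a mapping, using only the standard library. It assumes the counter starts at 1 on March 16, 2020; the function names and the default format string are illustrative choices, not existing code in this repository.

```python
from datetime import date, timedelta

# ASSUMPTION: day 1 of the counter is March 16, 2020.
START_DATE = date(2020, 3, 16)

def day_counter_to_date(counter, fmt="%m%d%y"):
    """Map a 1-based day counter to a date string (mmddyy by default)."""
    if counter < 1:
        raise ValueError("counter must be 1 or greater")
    return (START_DATE + timedelta(days=counter - 1)).strftime(fmt)

def date_to_day_counter(d):
    """Inverse mapping: a date object to its 1-based day counter."""
    return (d - START_DATE).days + 1

print(day_counter_to_date(1))    # 031620
print(day_counter_to_date(32))   # 041620
print(date_to_day_counter(date(2020, 4, 16)))  # 32
```

With this in place, March 16 and April 16 map to counters 1 and 32, so the two dates can no longer be confused. Passing a different fmt, such as "%y%m%d", yields the other date layouts mentioned above.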

Data structure of this file

What do the index numbers mean? Why is it not structured by classification and total? e.g.:

{ "deaths": "12345", "cases": "12345678" }

As currently structured, the data gives very little indication of how the fields relate to the COVID-19 categories.
