reichlab / covid19-forecast-hub

Projections of COVID-19, in standardized format

Home Page: https://covid19forecasthub.org

License: Other

R 2.40% Shell 0.21% JavaScript 3.08% Python 1.97% HTML 9.29% Vue 0.66% CSS 0.74% TypeScript 0.41% Dockerfile 0.01% Jupyter Notebook 81.10% SCSS 0.12% Makefile 0.02%
covid19 forecasts covid-19 forecast-data covid-data github-pages visualization analytics

covid19-forecast-hub's People

Contributors

aaronger, aniruddhadiga, deankarlen, elray1, epideep, eycramer, frostxtj, fvbttu, gcgibson, github-actions[bot], hannanabdul55, jarad, jturtle, katiehouse3, kraus-stat, michaellli, micokoch, mkim425, mzorn-58, nickreich, rjpagano, robertwalraven, serena-wang, shanghongxie, starkari, stevemcconnell, taosunvoyage, xinyuexiong, youyanggu, zyt9lsb

covid19-forecast-hub's Issues

Add list of teams to ReadMe?

Add a subsection enumerating teams / sources of forecasts we are planning to include + links to their repositories or websites

Standardize processed data filenames

In addition to the missing fields in #66, the newest MOBS processed data file has a filename that is non-standard (for me) and is causing me issues with reading processed data for the shiny data-processing app.

Although I could update the data reading script, I think the real issue is that we don't seem to have a standardized filename for processed data. I was assuming "-" was a reserved character, so that the files are named

YYYY-MM-DD-team-model.csv

Can we set this as a standard?
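To illustrate what enforcing this convention could look like, here is a minimal sketch in Python. It assumes that "-" really is reserved, so neither the team nor the model part may contain one. The names `parse_forecast_filename` and `ForecastFile` are hypothetical and not part of the repository.

```python
import re
from datetime import date
from typing import NamedTuple, Optional

# Assumed pattern: YYYY-MM-DD-team-model.csv, with "-" reserved as the separator.
FILENAME_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})-([^-]+)-([^-]+)\.csv")


class ForecastFile(NamedTuple):
    forecast_date: date
    team: str
    model: str


def parse_forecast_filename(name: str) -> Optional[ForecastFile]:
    """Return the parsed parts of a filename, or None if it is non-standard."""
    match = FILENAME_RE.fullmatch(name)
    if match is None:
        return None
    year, month, day, team, model = match.groups()
    try:
        forecast_date = date(int(year), int(month), int(day))
    except ValueError:
        # Digits in the right places, but not a real calendar date.
        return None
    return ForecastFile(forecast_date, team, model)
```

Under this sketch, a name such as 2020-04-13-CU-60contact.csv parses into team "CU" and model "60contact", while a name with an underscore in place of the last hyphen is rejected.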

Write plausibility checks

Write a script that runs some plausibility checks on cleaned data, e.g.:

  • no quantile crossing
  • quantiles for cumulative deaths greater than or equal to those for incident deaths
  • quantiles for cumulative deaths non-decreasing over time
  • cumulative week-ahead and corresponding day-ahead forecasts coincide

Maybe related to #13?
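The first three checks could be sketched as small Python helpers. This is only an illustration: it assumes each forecast is available as a mapping from quantile level to value, which is not necessarily how the cleaned files are laid out, and all function names are hypothetical.

```python
from typing import Dict, List


def has_quantile_crossing(quantiles: Dict[float, float]) -> bool:
    """True if a higher quantile level has a smaller value than a lower level."""
    values = [quantiles[level] for level in sorted(quantiles)]
    return any(later < earlier for earlier, later in zip(values, values[1:]))


def is_non_decreasing(values: List[float]) -> bool:
    """True if values (e.g. one cumulative quantile over time) never drop."""
    return all(later >= earlier for earlier, later in zip(values, values[1:]))


def cumulative_covers_incident(
    cumulative: Dict[float, float], incident: Dict[float, float]
) -> bool:
    """True if each cumulative quantile is >= the incident quantile at that level."""
    shared_levels = set(cumulative) & set(incident)
    return all(cumulative[level] >= incident[level] for level in shared_levels)
```

For example, `has_quantile_crossing({0.025: 10, 0.5: 20, 0.975: 15})` flags a crossing because the 0.975 quantile is below the median.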

migrate to a clearer structure for what forecasts are made when

There are two competing priorities here:
(1) record all (or nearly all - do we really want to store every update, even if daily?) forecasts made by teams, as they make them [useful for "tracker"-like sites that want all versions and real-time updates]
(2) record forecasts made by teams that are available at a specific time, and use them to build an ensemble. Realistically, for the foreseeable future we might just want to update the ensemble once a week. [useful for standardizing our ensemble]

Here is one proposal for how to do this:

  • we have the data-processed directory contain all (or nearly all) forecasts from each team. no restrictions on when these forecasts are submitted.
  • each file is marked with the date the forecast was made. This would slightly relax our current restriction that these YYYY-MM-DD dates only refer to Mondays. I'm going to refer to this date in the filename as fcast_date below.
  • we set really clear guidelines for when "1 wk ahead" means epiweek(fcast_date) and when it means epiweek(fcast_date)+1. for example, we say that if weekday(fcast_date) is Thursday, Friday or Saturday, then "1 wk ahead" means epiweek(fcast_date)+1 and otherwise epiweek(fcast_date). (I don't feel that strongly about where the threshold is for switching over. Could be Tuesday, could be Thursday.)
  • to reinforce this and avoid inadvertent errors in assignment of targets to days/weeks, we could also accept a new column in the files called end_date, so files submitted with fcast_date of 2020-04-23 (Thursday of EW 17) or 2020-04-27 (Monday of EW 18) would both have a "1 wk ahead" forecast with end_date of 2020-05-02 (Saturday of EW 18).
  • on Mondays at a fixed time (6pm ET?) we run an ensemble script that finds all available forecasts from a team made since the preceding Thursday (i.e. 4 days prior) and takes the most recent forecast to include in the ensemble.
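The date rules in this proposal can be sketched in Python as follows. This is a minimal illustration, not the project's implementation: it assumes epiweeks run Sunday through Saturday, uses the Thursday threshold from the example (which the proposal leaves open), and the function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable, Optional

# Python weekday(): Monday=0 ... Sunday=6. Thursday, Friday, Saturday switch over.
SWITCH_WEEKDAYS = {3, 4, 5}


def one_week_ahead_end_date(fcast_date: date) -> date:
    """Saturday that ends the epiweek targeted by a "1 wk ahead" forecast."""
    days_to_saturday = (5 - fcast_date.weekday()) % 7
    end_of_epiweek = fcast_date + timedelta(days=days_to_saturday)
    if fcast_date.weekday() in SWITCH_WEEKDAYS:
        # Late in the week, "1 wk ahead" means epiweek(fcast_date) + 1.
        return end_of_epiweek + timedelta(days=7)
    return end_of_epiweek


def latest_eligible_forecast(
    fcast_dates: Iterable[date], ensemble_monday: date, window_days: int = 4
) -> Optional[date]:
    """Most recent forecast date from the preceding Thursday up to the Monday."""
    earliest = ensemble_monday - timedelta(days=window_days)
    eligible = [d for d in fcast_dates if earliest <= d <= ensemble_monday]
    return max(eligible) if eligible else None
```

With these rules, both `one_week_ahead_end_date(date(2020, 4, 23))` and `one_week_ahead_end_date(date(2020, 4, 27))` give 2020-05-02, matching the end_date example above. The ensemble step would call `latest_eligible_forecast` once per team.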

add additional validations

Some additional validations

  • ensure that we are checking for all column names required by the repo (right now we are requiring forecast_date and target_end_date, which are not part of Zoltar). Can we require these?
  • are we validating the FIPS locations based on the specific set of valid numbers, or just any string of a number between 01 and 95? I would prefer the former, so we are doing it specifically for accepted FIPS.
  • can we institute a more complex check to ensure that people are aligning forecast_date and target_end_date correctly? I will explain more below.
  • Require point estimates (exactly one point estimate per location/target tuple) - we know from Katie's code that the forecast_date column is the same for the entire file (based on filename)
  • update https://github.com/reichlab/covid19-forecast-hub/wiki/Validation-Checks
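A rough sketch of three of these validations in Python is shown below. It assumes rows have already been read into dictionaries (for instance with `csv.DictReader`) and that the files carry `location`, `target`, and `type` columns with `type == "point"` marking point estimates. Those column names, the function name, and the idea of passing in the accepted FIPS set are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List, Set

REQUIRED_COLUMNS = {"forecast_date", "target_end_date"}


def validate_rows(rows: List[Dict[str, str]], accepted_fips: Set[str]) -> List[str]:
    """Return a list of human-readable validation errors (empty if none)."""
    errors = []
    point_counts = Counter()
    pairs = set()
    for index, row in enumerate(rows):
        missing = sorted(REQUIRED_COLUMNS - set(row))
        if missing:
            errors.append(f"row {index}: missing columns {missing}")
        # Check against the explicit accepted set, not just "any number 01-95".
        if row.get("location") not in accepted_fips:
            errors.append(f"row {index}: unknown FIPS code {row.get('location')!r}")
        pair = (row.get("location"), row.get("target"))
        pairs.add(pair)
        if row.get("type") == "point":
            point_counts[pair] += 1
    # Require exactly one point estimate per location/target tuple.
    for pair in sorted(pairs, key=str):
        if point_counts[pair] != 1:
            errors.append(
                f"{pair}: expected exactly 1 point estimate, found {point_counts[pair]}"
            )
    return errors
```

The accepted set would come from an authoritative FIPS list. A call such as `validate_rows(rows, {"01", "25", "36"})` uses a small illustrative subset only.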

Separate forecasts from truth

I suggest we reorganize the data so that forecasts are separate from truth, e.g.

data-raw/forecasts
data-raw/truth
data-processed/forecasts
data-processed/truth

The subdirectory structure within the forecasts/ subdirectories would be the same as it is now.

Also, perhaps we should include nytimes "gold-standard" data in addition to the JHU data.

Move processing scripts to data-raw/ folders

Currently most of the code is in the code/ directory and recently organized into subdirectories. As a general principle, I suggest we move code closer to the data it is used on. For example, I suggest we move raw data processing scripts to the data-raw/ folder.

The code/ directory could still be used for functions (rather than scripts) that are used in multiple scripts.

Remove Imperial ensemble forecast files from data-raw/ folder

All team forecasts should be in a subdirectory of data-raw/, but these files

https://github.com/reichlab/covid19-forecast-hub/blob/master/data-raw/2020-04-19-Imperial-ensemble1.csv
https://github.com/reichlab/covid19-forecast-hub/blob/master/data-raw/2020-04-19-Imperial-ensemble2.csv

are directly in the data-raw/ folder.

I would create a pull request, but these files differ from the files in the data-raw/Imperial subdirectory, so I'm not sure which versions should be preserved.

Add forecast_date, target_end_date to 2020-04-13 CU data-processed/ files

The following files need required fields forecast_date and target_end_date:

data-processed/CU-60contact/2020-04-13-CU-60contact.csv
data-processed/CU-70contact/2020-04-13-CU-70contact.csv
data-processed/CU-80contact/2020-04-13-CU-80contact.csv
data-processed/CU-nointerv/2020-04-13-CU-nointerv.csv

update target list

What are the next-phase targets that we want to include? We should probably phase these in slowly, to reduce the strain of creating checks, visualizations, and ensembles for new targets. Candidates are:

  • incident hospitalization demand by week/day?
  • ICU bed demand by week/day?
    ...
