
covid-xprize's Introduction

XPRIZE Pandemic Response Challenge

Introduction

Welcome to the XPRIZE Pandemic Response Challenge! This repository contains what you need to get started in creating your submission for the contest.

Within this repository you will find:

  • Sample predictors and prescriptors provided by Cognizant, in the form of Jupyter notebooks and Python scripts
  • Sample implementations of the "predict" API and the "prescribe" API, which you will be required to implement as part of your submission
  • Sample IP (Intervention Plan) data to test your submission

Pre-requisites

To run the examples, you will need:

  • A computer or cloud image running a recent version of OS X or Ubuntu (Using Microsoft Windows™ may be possible but the XPRIZE team and Cognizant will be unable to support you.)
  • Sufficient memory, CPU, and disk space to train machine learning models and run Python programs
  • An installed version of Python, version ≥ 3.6. To avoid dependency issues, we strongly recommend using a standard Python virtual environment with pip for package management. The examples in this repo assume you are using such an environment.

Having registered for the contest, you should also have:

  • A copy of the Competition Guidelines
  • Access to the Support Slack channel
  • A pre-initialized sandbox within the XPRIZE system

Examples

Under the covid_xprize/examples directory you will find some examples of predictors and prescriptors that you can inspect to learn more about what you need to do:

  • predictors/linear contains a simple linear model, using the Lasso algorithm.
  • predictors/lstm contains a more sophisticated LSTM model for making predictions.
  • prescriptors/zero contains a trivial prescriptor that always prescribes no interventions; prescriptors/random contains one that prescribes random interventions.
  • prescriptors/neat contains code for training prescriptors with NEAT.

The instructions below assume that you are using a standard Python virtual environment, and pip for package management. Installations using other environments (such as conda) are outside the scope of these steps.

In order to run the examples locally:

  1. Ensure your current working directory is the root folder of this repository (the same directory as this README resides in). The examples assume your working directory is set to the project root and all paths are relative to it.
  2. Ensure your PYTHONPATH includes your current directory:
    export PYTHONPATH="$(pwd):$PYTHONPATH"
  3. Create a Python virtual environment
  4. Activate the virtual environment
  5. Install the necessary requirements:
    pip install -r requirements.txt --upgrade
  6. Start Jupyter services:
    jupyter notebook
    This causes a browser window to launch
  7. Browse to and launch one of the examples (such as linear) and run through the steps in the associated notebook -- in the case of linear, Example-Train-Linear-Rollout-Model.ipynb.
  8. The result should be a trained predictor, and some predictions generated by running the predictor on test data. Details are in the notebooks.

XPRIZE sandbox

Upon registering for the contest, you will have been given access to a "sandbox", a virtual area within the XPRIZE cloud within which you can submit your work.

Submitting a predictor

In order for the automated judging process to detect and evaluate your submission, you must follow the instructions below. If your script deviates from the API in any way, your submission will be omitted from judging.

  1. Within your sandbox, under your home directory you will find a pre-created work directory.
  2. Under this work directory, you must provide a Python script with the name predict.py. Examples of such scripts are provided in this repository. This script will invoke your predictor model and save the predictions produced.
  3. Your script must accept particular command line parameters, and generate a particular output, as explained below.
  4. Whatever models and other data files your predictor requires must be uploaded to your sandbox and visible to your predict.py script, for example, by placing them in the work directory or subdirectories thereof.
  5. Expect that the current working directory will be your sandbox work directory when your script is called. Therefore, references to other modules and resource files should be relative to that.
  6. Expect your script to be called as follows (the dates and filenames are just examples and will vary):
    python predict.py --start_date 2020-12-01 --end_date 2020-12-31 \
      --interventions_plan ip_file.csv --output_file 2020-12-01_2020_12_31.csv
  7. It is the responsibility of your script to run your predictor for the dates requested (between start_date and end_date inclusive) and generate predictions in the path and file specified by output_file, using the provided intervention plan. Take careful note of the performance and timing requirements in the Competition Guidelines for running your predictor.

For more details on this API, consult the Competition Guidelines or the support Slack channel.
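
To make the command-line contract above concrete, here is a minimal, hedged sketch of a predict.py entry point. It is illustrative only: the output columns shown (and the assumption that the IP file carries CountryName, RegionName, and Date columns) are ours, and the placeholder "predict zero cases" logic stands in for your own model. The authoritative format is defined by the Competition Guidelines and the sample scripts in this repository.

    # Minimal, illustrative sketch of a predict.py entry point. Column names are
    # assumptions; consult the Competition Guidelines and the sample predictors
    # for the required output format.
    import argparse

    import pandas as pd


    def predict(start_date, end_date, path_to_ips_file, output_file_path):
        # Read the intervention plan covering the requested period.
        ip_df = pd.read_csv(path_to_ips_file, parse_dates=["Date"])
        # Placeholder logic: a real submission runs its trained model here.
        # We simply emit zero new cases for every region/day in the requested range.
        mask = (ip_df["Date"] >= start_date) & (ip_df["Date"] <= end_date)
        predictions = ip_df.loc[mask, ["CountryName", "RegionName", "Date"]].copy()
        predictions["PredictedDailyNewCases"] = 0.0
        predictions.to_csv(output_file_path, index=False)


    if __name__ == "__main__":
        parser = argparse.ArgumentParser()
        parser.add_argument("-s", "--start_date", required=True, help="first date to predict, YYYY-MM-DD")
        parser.add_argument("-e", "--end_date", required=True, help="last date to predict, YYYY-MM-DD")
        parser.add_argument("-ip", "--interventions_plan", required=True, help="path to the intervention plan CSV")
        parser.add_argument("-o", "--output_file", required=True, help="path of the CSV file to write predictions to")
        args = parser.parse_args()
        predict(args.start_date, args.end_date, args.interventions_plan, args.output_file)

The short flag aliases are an assumption based on example commands elsewhere in this repository; the long-form flags shown in step 6 above are what the judging process uses.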

Submitting a prescriptor

In order for the automated judging process to detect and evaluate your submission, you must follow the instructions below. If your script deviates from the API in any way, your submission will be omitted from judging.

  1. Within your sandbox, under your home directory you will find a pre-created work directory.
  2. Under this work directory, you must provide a Python script with the name prescribe.py. Examples of such scripts are provided in this repository. This script will invoke your prescriptor model and save the prescriptions produced.
  3. Your script must accept particular command line parameters, and generate a particular output, as explained below.
  4. Whatever models and other data files your prescriptor requires must be uploaded to your sandbox and visible to your prescribe.py script, for example, by placing them in the work directory or subdirectories thereof.
  5. Expect that the current working directory will be your sandbox work directory when your script is called. Therefore, references to other modules and resource files should be relative to that.
  6. Expect your script to be called as follows (the dates and filenames are just examples and will vary):
    python prescribe.py --start_date 2020-12-01 --end_date 2020-12-31 \
      --interventions_past ip_file.csv --output_file 2020-12-01_2020_12_31.csv
  7. It is the responsibility of your script to run your prescriptor for the dates requested (between start_date and end_date inclusive) and generate prescriptions in the path and file specified by output_file. Take careful note of the performance and timing requirements in the Competition Guidelines for running your prescriptor.

Example prescriptors can be found under covid_xprize/examples/prescriptors/.

For more details on this API, consult the Competition Guidelines or the support Slack channel.
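
A prescribe.py entry point follows the same pattern as the predict.py sketch shown earlier, with --interventions_past replacing --interventions_plan. The outline below is hedged and deliberately left as a stub, since the required prescription output format (prescribed NPI values per region per day) is specified in the Competition Guidelines.

    # Minimal, illustrative outline of a prescribe.py entry point.
    import argparse


    def prescribe(start_date, end_date, path_to_prior_ips_file, output_file_path):
        # A real prescriptor loads its model here, reads the historical
        # interventions from path_to_prior_ips_file, and writes prescribed
        # intervention plans (per region, per day) to output_file_path.
        raise NotImplementedError("Replace with your prescriptor logic")


    if __name__ == "__main__":
        parser = argparse.ArgumentParser()
        parser.add_argument("-s", "--start_date", required=True)
        parser.add_argument("-e", "--end_date", required=True)
        parser.add_argument("-ip", "--interventions_past", required=True)
        parser.add_argument("-o", "--output_file", required=True)
        args = parser.parse_args()
        prescribe(args.start_date, args.end_date, args.interventions_past, args.output_file)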

Trained standard predictor

The repo also provides a trained standard predictor to train prescriptors against. To use it, call covid_xprize/standard_predictor/predict.py to make predictions. See get_predictions in covid_xprize/examples/prescriptors/neat/utils.py and generate_cases_and_stringency_for_prescriptions in prescriptor_robojudge.ipynb for examples of how to make this call.
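
One straightforward way to call the standard predictor from your own code is to run the script as a subprocess, as in the hedged sketch below. It assumes the standard predictor accepts the same command-line flags as the predict.py API described above; get_predictions in covid_xprize/examples/prescriptors/neat/utils.py shows how the examples in this repository actually wrap the call.

    # Hedged sketch: invoke the trained standard predictor as a subprocess,
    # assuming it exposes the same CLI flags as the predict.py API above.
    import subprocess
    import sys


    def run_standard_predictor(start_date, end_date, ip_file, output_file):
        subprocess.run(
            [sys.executable, "covid_xprize/standard_predictor/predict.py",
             "--start_date", start_date,
             "--end_date", end_date,
             "--interventions_plan", ip_file,
             "--output_file", output_file],
            check=True,  # raise if the predictor exits with a non-zero status
        )

    # Example usage (paths are illustrative):
    # run_standard_predictor("2020-12-01", "2020-12-31", "scenario_ip.csv", "predictions.csv")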

More information/Support

For more information and support, refer to the competition guidelines or post your questions in the support Slack channel; you should have gained access to both of these when you created a login in the competition platform.

For a concrete visualization of what the competition is about, see Cognizant's COVID-19 intervention optimization demo. Using this dashboard you can select among different prescriptors from the Pareto Front to see the effect on prescriptions for intervention plans in various regions.

For more background information please see also the research paper From Prediction to Prescription: Evolutionary Optimization of Non-Pharmaceutical Interventions in the COVID-19 Pandemic.

Copyright 2020 (c) Cognizant Digital Business, Evolutionary AI. All rights reserved. Issued under the Apache 2.0 License.

covid-xprize's People

Contributors

bradyneal · dependabot[bot] · donn-leaf · dsargent · ekmeyerson · jamiesonwarner · muditjai · ofrancon · ristopm


covid-xprize's Issues

Create "guidelines for judges" doc

This isn't specifically called out in the Guidelines but seems a "nice to have" to ensure consistent standards across all judges.

The doc should give judges some idea of what to look for, what criteria to use, where to give extra points (and where to deduct points!)

Of course judges are experts in their areas and will have considerable leeway and discretion.

Forums vs Slack channel

We need to provide the teams with a way to discuss technical issues, ask questions about the competition, discuss the data or ML tracks, etc. For that we could use https://community.xprize.org/discussions or create dedicated Slack channels. If we choose the community forums, we will have to provide a moderator and answer some questions.

Predict PredictedDailyNewDeaths and 95CI

From Toby's comments on the prediction file produced by predict.py:

Also: PredictedDailyNewDeaths, also 95CI (or SD) for both.

Fine for cases to be headline and leaderboard, but the rest is important for judging.

Should we add them? Optional or mandatory fields?

Implement a weighted Stringency metric

Some parts of the contest call for measuring Stringency, which we will do according to the Oxford University specification.

Currently there is nothing about stringency in RoboJudge.

Stringency is used in the prescriptors but is not yet weighted.
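
As a starting point for discussion, a weighted stringency could look like the sketch below: a per-day weighted sum of NPI levels. This is deliberately NOT the Oxford specification (which defines its own normalisation); it only shows where per-NPI weights would enter a RoboJudge-style computation, with the NPI column names and weights passed in.

    # Illustrative only: per-day weighted sum of NPI levels, not the Oxford formula.
    import pandas as pd


    def weighted_stringency(ip_df, npi_columns, weights):
        """ip_df: one row per region per day; weights: dict of NPI column -> weight."""
        w = pd.Series(weights).reindex(npi_columns).fillna(1.0)  # default weight 1
        return (ip_df[npi_columns] * w).sum(axis=1)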

Refine predictions validation

Following a discussion with @EKMeyerson: we need to validate the predictors' predictions in a few more ways:

Note: in order to work, the *ip.csv file should contain ALL NPIs since 2020-01-01, to guarantee there is NO GAP in the NPIs.
Call 2020-01-01 the inception date.
Hopefully these validations make the "contract" clearer for participants.
We could provide these validations as some kind of unit tests.
Note: cases are NOT provided by the API's parameters. Models have to store locally whatever data they can before the cut-off date (i.e. the submission date, after which they lose internet access).

  1. 4 days right after the "submission date", which means the model has access to the data up to start_date - 1
    We already have a similar one, except the IP file should contain all NPIs since inception:
!python predict.py -s 2020-08-01 -e 2020-08-04 -ip ../../validation/data/2020-01-01_2020-08-04_ip.csv
  2. 1 month in the future
    For instance:
!python predict.py -s 2021-01-01 -e 2021-01-31 -ip ../../validation/data/2020-01-01_2021-01-31_ip.csv
  3. 1 month in the past, but with different NPIs (counterfactuals)
!python predict.py -s 2020-04-01 -e 2020-04-30 -ip ../../validation/data/2020-01-01_2020-04-30_ip.csv

That can give us interesting counterfactuals. For instance, what would have happened if each NPI had been one level stricter (+1)? What if one level less strict (-1)? => Interesting for qualitative checks.
Also useful to explain things like: 70% of the predictors say cases would have been 50% lower if NPIs had been one level stricter, 20% say 75% lower, 10% say 25% higher (for instance).

  4. 6 months in the future
    Assuming 6 months is our maximum prediction horizon. Maybe explicitly say 180 days max. Maybe "range" rather than "horizon".
!python predict.py -s 2020-08-01 -e 2021-01-28 -ip ../../validation/data/2020-01-01_2021-01-28_ip.csv

Rationale: the IP file contains the actual IPs as long as they are known, and after that some scenarios such as 'frozen' NPIs. I'd like to use that.
Note: we also need to validate that the prediction is done under a time limit of 1 hour. This one hour is totally arbitrary for the moment. We should discuss what it corresponds to (in terms of sandbox seconds per region per day of prediction, for instance).

  5. Single region
!python predict.py -s 2020-08-01 -e 2020-08-04 -ip ../../validation/data/2020-01-01_2020-08-04_Italy_ip.csv

The Italy IP file would contain NPIs for Italy only, and we should validate that we get predictions for Italy only.

  6. Multiple regions
!python predict.py -s 2020-08-01 -e 2020-08-04 -ip ../../validation/data/2020-01-01_2020-08-04_USA_ip.csv

Here the USA IP file would contain the USA plus the 50 states.

Country and Region identification

We're currently using CountryName and RegionName.

  1. Figure out which standard Oxford is using for CountryCode and RegionCode
  2. Figure out if we should use CountryCode and RegionCode in addition to or in replacement of CountryName and RegionName

Complete validation module

As discussed with @ofrancon there are extra items that could (maybe should) be included in Submission Validation. Discuss and decide which items should be included, and complete the validator.

End-to-end test

We need a proof of concept end-to-end run to prove out the system:

  • Submission in sandbox
  • Robojudge automatically generates prescriptions and evaluates them
  • Submission shows up in leaderboard

Consider adding an `output-file` parameter to the `predict.py` API

In order to be able to call the predictor multiple times, consider adding an output-file parameter to the predict.py API.

Would allow things like:

  • 2 calls for the predictor for same dates, but with different NPI scenarios
  • specific calls for the same date but for specific regions

E.g.:

!python predict.py -s 2020-08-01 -e 2020-09-30 -ip /intervention-plans/2020-08-01_2020-09-30_strict.csv -o /predictions/2020-08-01_2020-09-30_strict.csv

or

!python predict.py -s 2020-08-01 -e 2020-09-30 -ip /intervention-plans/2020-08-01_2020-09-30_UK.csv -o /predictions/2020-08-01_2020-09-30_UK.csv

Robojudge: compute a 7 days moving average metric

The current metric is the difference in daily new cases between the predicted and actual (true) values.
Let's call it DiffDaily.
A better metric would be the difference between the actual 7-day moving average and the predicted 7-day moving average.
Let's call it Diff7DMA.
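
A possible implementation sketch of Diff7DMA for a single region (whether to take the absolute value and how to aggregate across regions are open questions):

    # Sketch: difference between the 7-day moving averages of actual and
    # predicted daily new cases, for one region.
    import pandas as pd


    def diff_7dma(actual, predicted):
        """actual, predicted: pandas Series of daily new cases indexed by date."""
        return (actual.rolling(7).mean() - predicted.rolling(7).mean()).abs()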

Robojudge: Handle NaN in predictions

If a predictor makes a NaN prediction, make sure the error (diff) is still computed, and is not NaN.
It's OK to interpret a NaN prediction as 0.
But it's not OK to interpret a NaN error as 0: that would make such a predictor the best possible predictor.

Like "Linear" in this example:
image
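
One way to enforce this in RoboJudge, sketched for a single region:

    # Sketch: a NaN prediction is counted as 0 cases, so the error is always a
    # real (and typically large) number rather than NaN.
    import pandas as pd


    def safe_error(actual, predicted):
        """actual, predicted: pandas Series of daily new cases."""
        return (actual - predicted.fillna(0)).abs()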

Validate submissions

Write a script that validates submissions:

  • Expected column names
  • No NaN predictions
  • No negative predictions
  • For each country, each region:
    • No missing day (between start_date and end_date)
    • No duplicated day

Proposed API:

validate_submission(start_date, end_date, submission_url)

We can add more validation rules as we go.
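
A sketch of what the proposed API could look like, implementing the checks listed above (the PredictedDailyNewCases column name is an assumption):

    # Sketch of the proposed validate_submission API, covering the checks above.
    import pandas as pd


    def validate_submission(start_date, end_date, submission_url):
        """Return a list of human-readable errors; an empty list means the submission is valid."""
        errors = []
        df = pd.read_csv(submission_url, parse_dates=["Date"])
        expected_columns = {"CountryName", "RegionName", "Date", "PredictedDailyNewCases"}
        missing = expected_columns - set(df.columns)
        if missing:
            return [f"Missing columns: {sorted(missing)}"]
        if df["PredictedDailyNewCases"].isna().any():
            errors.append("NaN predictions found")
        if (df["PredictedDailyNewCases"] < 0).any():
            errors.append("Negative predictions found")
        expected_days = pd.date_range(start_date, end_date)
        for (country, region), group in df.groupby(["CountryName", "RegionName"], dropna=False):
            if group["Date"].duplicated().any():
                errors.append(f"Duplicated day for {country} / {region}")
            if not set(expected_days).issubset(set(group["Date"])):
                errors.append(f"Missing day(s) for {country} / {region}")
        return errors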

Implement Bookmarks

We should probably use the 'bookmark' plugin that XPrize will publish this week (version 1.2) for the Oxford dataset. From what I understand a bookmark points to a dataset, and we can specify 'latest' or a specific version.

Create POP registration process

Candidates need to be able to sign up, pay the registration fee, download the zip, agree to the legalese, etc.

It might be that the X-Prize folks (Max etc.) will 100% handle this but we will likely need to at least review the process.

Registration should also grant access to a team-specific Sandbox.

Fix layout of examples

For symmetry and easier understanding, make it so examples has subdirectories prescriptors and predictors.

Note: some paths in the predictor notebooks may have to be fixed up when doing this.

Create a README.md

We need to provide participants with a README.md that explains what they have to do (predict.py) and what they should look at (the example notebooks).

Refactor prediction-prescription loop in example

In the neat example, train_prescriptor.py and prescribe.py contain a similar chunk of code for running prediction-prescription rollouts. There may be a nice way to factor out this code so that it is shared and parameterized in a clarifying way.

Implement masks NPI

We are dependent upon Oxford University adding it to the data. Figure out what we need to do (if anything?) to add this new NPI.

Multi-language support in Sandbox

Python and/or R: do we want to limit entries to Python, or do we want to allow R too? My 2 cents: on one hand we would attract more teams and submissions; on the other hand, I don't think we would be able to provide feedback on / judge R submissions.

Discuss and ballpark expected number of participants

Max had a good question: how many teams do we expect? If we announce on September 29, and have a registration deadline on October 11, do we give enough time for teams to hear about the competition, organize, and register?

Automatically run unit tests

We now have unit tests for the LSTM example model and for the submission validation function. These tests should be run automatically.

Finalize judging panel

Determine who will be on the panel and make sure they are aware of their responsibilities and the tasks required of them

Validator for generated prescriptions

Add a way to validate that generated prescriptions are valid.

  • contains each country and region
  • contains each day
  • prescribed NPI values are within the valid bounds for each NPI (e.g. integers between 0 and 3)
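
A hedged sketch of such a validator; the NPI column names and their maximum values are passed in rather than hard-coded, since the real bounds come from the Oxford NPI definitions:

    # Sketch of a prescription validator for the checks listed above.
    import pandas as pd


    def validate_prescriptions(pres_df, npi_max_values, expected_regions, expected_days):
        """npi_max_values: dict of NPI column -> max allowed integer value.
        expected_regions: set of (CountryName, RegionName) tuples."""
        errors = []
        regions = set(zip(pres_df["CountryName"], pres_df["RegionName"]))
        if not expected_regions.issubset(regions):
            errors.append("Some expected countries/regions are missing")
        if not set(expected_days).issubset(set(pres_df["Date"])):
            errors.append("Some expected days are missing")
        for npi, max_value in npi_max_values.items():
            values = pres_df[npi]
            out_of_bounds = (values < 0) | (values > max_value) | (values % 1 != 0)
            if out_of_bounds.any():
                errors.append(f"{npi} has values outside the integer range [0, {max_value}]")
        return errors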

Estimate compute power needed for phase 1 daily and final evaluations

According to the manual we will have 200-300 teams competing in phase 1, and we need to select 30-50 finalists, which means we need to run their scripts daily in order to produce rankings over the 6 week trial period. Other evaluations across the teams may also be needed to produce quantitative measures. Some qualitative measures, such as generalization to other geographies may also need compute power. We need a high level estimate of the compute needs.

Testing/exploration environment

(Normally we'd just call this a "sandbox" but that term already has special meaning in the X-Prize context.)

We can ask "beta testers" (in mind: Mohak, maybe some people from the research team, Risto's TA) to create an account on data.xprize.org and then we can send their email address and name to Krishna for him to provide them with a sandbox.

Provide a scenario generator

Provide a way to generate a "scenario".
A scenario is an IP file with all IPs since inception until an end date.
It can be for one country/region or many.
It can contain only historical IPs, or modified IPs (such as frozen IPs, max IPs, or min IPs) between two dates.
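
A sketch of one flavour of scenario generator: extend each region's historical IPs by freezing its last known NPI values up to end_date. The CountryName/RegionName/Date column names follow the convention used elsewhere in this repo; the NPI columns are passed in.

    # Sketch: freeze each region's last known NPI values forward to end_date.
    # Assumes hist_ip_df has CountryName, RegionName, a datetime Date column,
    # and one column per NPI (npi_columns).
    import pandas as pd


    def frozen_scenario(hist_ip_df, npi_columns, end_date):
        frames = []
        for (country, region), group in hist_ip_df.groupby(["CountryName", "RegionName"], dropna=False):
            group = group.sort_values("Date")
            last_known = group.iloc[-1]
            future_days = pd.date_range(last_known["Date"] + pd.Timedelta(days=1), end_date)
            future = pd.DataFrame({"CountryName": country, "RegionName": region, "Date": future_days})
            for npi in npi_columns:
                future[npi] = last_known[npi]  # hold the last known level constant
            frames.append(pd.concat([group, future], ignore_index=True))
        return pd.concat(frames, ignore_index=True)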

Ranking for predictors (round 1)

How to rank -- simple MAE ranking?

How to decide which predictors/teams go forward to Phase II? What if one team does well on Taiwan but another does well on a totally different type of region like the USA?
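
One possible (not decided) Round 1 scheme, sketched below: rank teams by mean absolute error averaged over all regions and days, optionally complemented by per-region rankings to surface the Taiwan-vs-USA concern above. The errors_df layout is an assumption for illustration.

    # Sketch of a simple MAE ranking; errors_df has one row per
    # (Team, CountryName, RegionName, Date) with an AbsoluteError column.
    import pandas as pd


    def rank_by_mae(errors_df):
        mae = errors_df.groupby("Team")["AbsoluteError"].mean()
        return mae.sort_values().reset_index(name="MAE")  # best (lowest MAE) first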

API for predictions

Max is looking into a few options regarding the "API" itself: how we could run a script to generate a prediction file, where to store the prediction files (S3?), and where to store the rankings file. XPrize can probably host the ranking page itself.

Set up Robojudge scheduling

At a designated time each day, judge should evaluate submissions and upload results to S3 for display by Dashboard.

Avoid off-by-one by making API more clear

There may be some small but potentially critical details to clear up regarding exactly what data will be available and what needs to be output in the prediction and prescription APIs. E.g., do I know today's IPs when I predict cases for today, or just yesterday's IPs?
