institutefordiseasemodeling / covasim

COVID-19 Agent-based Simulator (Covasim): a model for exploring coronavirus dynamics and interventions

Home Page: https://covasim.org

License: MIT License

Python 98.97% Shell 0.26% TeX 0.77%

covid-19 agent-based abm simulation model coronavirus npi contact-tracing stochastic epidemiology

covasim's Issues

CD pipeline

Whenever we release (build, push, and update PyPI), we could also automatically deploy the new version to Azure, assuming the infrastructure is ready.

Setup new deployment infrastructure on Azure

Currently the webapp runs on a single VM. We should set it up in a highly available, load-balanced, and otherwise more robust way.

UI - Support for interventions for specific time frames

The user group wants to know the output when an intervention is implemented, lifted, then re-implemented at a future date.
Possible solution: A range slider for intervention-start-day with a "+" button to add a new point on the slider for intervention-end-day. The "+" button could then add a third point for a re-implementation day.
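
One way the backend could represent an intervention that is implemented, lifted, and then re-implemented is a list of (start_day, end_day) windows. The sketch below is illustrative only: beta_multiplier, the window list, and the efficacy value are hypothetical names and numbers, not part of Covasim's API.

```python
def beta_multiplier(day, windows, efficacy):
    """Return the factor applied to transmission on `day`.

    `windows` is a list of (start_day, end_day) pairs, end day exclusive.
    `efficacy` is the fractional reduction while an intervention is active.
    """
    for start, end in windows:
        if start <= day < end:
            return 1.0 - efficacy
    return 1.0

# Implemented on day 10, lifted on day 30, re-implemented on day 45 until day 60
windows = [(10, 30), (45, 60)]
multipliers = [beta_multiplier(d, windows, efficacy=0.5) for d in (5, 10, 29, 30, 50, 60)]
```

Each extra "+" point on the slider would then add or close one window in the list.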

Project management tooling

We are starting to get more people involved in this project. It may be nice to allow some people in the org to do some simple project management, to make it easier to tell what is open to be worked on and what is in progress.

Proposal:

Set up a simple project board with backlog, in-progress, and finished columns. Set up automation on this board to update it automatically when possible.
Allow some of the core contributors to self-assign issues.
Determine who is in charge of reviewing and merging code.
Define a process for requesting reviews and reviewing/merging items.

Automatic calibration

Once #43 is done, we need to calibrate to the data. The Sim object already has a likelihood() method, but this needs to be improved (more data types, probably using cumulative rather than daily counts, etc.). We also need to decide what model parameters will be used for fitting -- potentially just beta and n_infected, but if others, may need a smarter optimization approach. We are almost there with a fancy method called BINNTS (bootstrapped iterative nearest-neighbor threshold sampling), but it's not quite ready to go, so could also look into an off-the-shelf method, especially if we just want the MLE and don't need the whole posterior.

Enable multiprocessing

Currently the simulation runs on only a single core. Can we add multiprocessing to make use of more workers and run the simulation faster?

CI system is broken right now

The checkout action fails, probably as a result of the move from the private repo.

Improve code performance

Investigate ways to reduce model run time -- more use of arrays rather than loops, sparse matrices, Numba, etc.

[UI] Allow data upload

Need a button to upload data. Format of data should be like
https://github.com/InstituteforDiseaseModeling/covasim/blob/master/tests/example_data.csv
in either Excel or CSV format. See the test
https://github.com/InstituteforDiseaseModeling/covasim/blob/master/tests/test_sim.py#L121
for usage. If uploaded, data should plot against sim results (as happens on the backend).

Related issue: #50

Include healthsystems into web

Currently, there are no configuration options in the webapp to address eventual shortages of, for example, ICU beds. It would be great to include these in the webapp. This would mean that when the number of severe cases passes a pre-configured threshold, mortality increases.

Improve data and likelihood functions

Avoid hard coding, more flexible file reading, and better likelihood calculations

Load CSV files
Load all columns
Automatic column matching
Include in plotting / remove data_mapping
Extend likelihood function and use cumulative counts
Write data file documentation / include example
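
As a rough illustration of the "use cumulative counts" item, a likelihood could compare running totals instead of daily values. The sketch below assumes a Poisson likelihood, which is a choice made here for illustration and not necessarily what Sim.likelihood() does; the function names and numbers are hypothetical. Note that cumulative counts are not independent from day to day, so this is a fitting heuristic rather than a strict likelihood.

```python
import math
from itertools import accumulate

def poisson_loglike(observed, expected):
    """Sum of Poisson log-probabilities of observed counts given expected counts."""
    total = 0.0
    for k, lam in zip(observed, expected):
        lam = max(lam, 1e-9)  # avoid log(0) when the model predicts zero
        total += k * math.log(lam) - lam - math.lgamma(k + 1)
    return total

def cumulative_loglike(daily_data, daily_model):
    """Compare cumulative rather than daily counts."""
    return poisson_loglike(list(accumulate(daily_data)), list(accumulate(daily_model)))

data = [1, 2, 4, 8]  # made-up daily counts
good_fit = cumulative_loglike(data, [1, 2, 4, 8])
poor_fit = cumulative_loglike(data, [4, 4, 4, 4])
```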

TypeError: attrs() got an unexpected keyword argument 'eq'

Followed the instructions, but got the error in the title when trying to import covasim. Same error if I install in terminal and try to run a sim.

Thanks!

UI integration with country selection

Same as #45, but in the UI.

Replace plotly with Altair or another cool dataviz library

Plotly, while functional, lacks some features, like interactive charts, and, well, it's not quite as "snazzy" as, for example, Altair: https://altair-viz.github.io/getting_started/overview.html

We could move our UI charts to it.

Enable make_randpop to accept country as parameter

Right now https://github.com/institutefordiseasemodeling/covasim/blob/develop/covasim/people.py#L315 is hardcoded to Seattle. We could use https://github.com/neherlab/covid19_scenarios/blob/da1381c429471c3d57b02cd5a7b6dc46e645c0b3/src/assets/data/country_age_distribution.json to get per-country ratios. Later we can write our own scraper.

[UI] Include oxygen and other interventions

Work from @aouedraogo and @lskrip-IDM's recent report to get oxygen support and other health system parameters built into Covasim.

Generate age distributions per state

Use census data to generate age distributions per state. This will enable us to get integrated into the unified UI quickly.

Receiving type error at runtime

Hi -
After I set up the project I was able to run the first example, but not the second (python examples/run_sim.py) due to what looks like a data type mismatch. Because I didn't specify any data types in the code, I wonder if this is a bug?

See my command prompt code and log below:

(covasim) C:\Users\laurel\covasim>python examples/run_sim.py
Importing...
Covasim 0.22.0 (2020-03-31) — © 2020 by IDM
Note: synthpops (for detailed demographic data) is not available (No module named 'synthpops')

Elapsed time: 3.06 s
Making sim...
Running...
TEMP init people
  Creating 20000.0 people...
  Created 20000 people, average age 38.58 years
  Running day 0 of 60 (0.60 s elapsed)...
  Running day 1 of 60 (0.60 s elapsed)...
  Running day 2 of 60 (0.60 s elapsed)...
Traceback (most recent call last):
  File "examples/run_sim.py", line 35, in <module>
    sim.run(verbose=verbose)
  File "C:\Users\laurel\covasim\covasim\sim.py", line 396, in run
    transmission_inds = cvu.bf(thisbeta, person.contacts)
  File "C:\Users\laurel\covasim\lib\site-packages\numba-0.48.0-py3.8-win32.egg\numba\dispatcher.py", line 574, in _explain_matching_error
    raise TypeError(msg)
TypeError: No matching definition for argument type(s) float64, array(int32, 1d, C)

Add intervention parameter to the run_sim RPC

Support for the cova_app.py to accept the necessary parameters for the intervention data

[UI] Decide on list of Africa-specific interventions

Building on the list of interventions listed here:

https://docs.google.com/document/d/1Q3-cLk9lo8Oizgs4JZTnqDCBjWu3oE93S42B9rSinfY/edit#

Provide a list of interventions that would be most useful to Africa CDC (e.g., different assumptions about social distancing and testing capacity)

Enable UI ground truth overlay

Since we'll have the latest data cached by #48 and countries selectable by #44, we can overlay the number of confirmed cases, deaths, and recoveries on the simulation results.

Accessibility - input labels/button names/color contrast/keyboard nav

WCAG 4.1.2 - buttons must have discernible text
WCAG 1.3.1, WCAG 3.3.2 - input controls must be associated with their labels
WCAG 1.4.3 - contrast ratio between text and background must meet AA standards
Keyboard navigation - tooltip information should be accessible for non-mouse users

Deploy website on new infra

After #9 is done, we should deploy the new website and point the domain to it.

Consider using logger instead of verbose

Currently, stdout is managed by a verbose variable passed between functions. This works, but better options might be available. Consider replacing with logger for greater flexibility.

Requirements:

Equally easy to configure, e.g. replacing sim.run(verbose=1) with sim.run(verbose='info') is OK; from covasim.utils import logger; logger.setLevel('INFO'); sim.run() is not
Same levels of detail are implemented: e.g. 0 = only warnings/errors to stdout, 1 = default output, 2 = print everything possible (debug).
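
The two requirements above could be met by resolving a single verbose argument into a standard logging level. This is a minimal sketch, assuming a mapping of 0/1/2 to WARNING/INFO/DEBUG; resolve_level, the logger name, and run() are hypothetical stand-ins, not Covasim functions.

```python
import logging

# Hypothetical mapping from the existing integer levels to logging levels
VERBOSE_TO_LEVEL = {0: logging.WARNING, 1: logging.INFO, 2: logging.DEBUG}

def resolve_level(verbose):
    """Accept either an int (0, 1, 2) or a level name such as 'info'."""
    if isinstance(verbose, str):
        level = logging.getLevelName(verbose.upper())
        if not isinstance(level, int):  # unknown names come back as strings
            raise ValueError(f"Unknown verbosity: {verbose!r}")
        return level
    return VERBOSE_TO_LEVEL[int(verbose)]

def run(verbose=1):
    """Stand-in for sim.run(): configures the logger from one argument."""
    logger = logging.getLogger("covasim_sketch")
    logger.setLevel(resolve_level(verbose))
    logger.info("Running...")
    logger.debug("Detailed per-day output")
    return logger.level
```

With this shape, run(verbose=1) and run(verbose='info') behave the same, and no separate logger setup call is needed.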

@RomeshA do you want to take a look at this? cc @gwincr11

GH action to release new version

A GH action that, whenever we release a new version, builds and publishes the Docker image and pushes the new package to PyPI.

Refactor simulation run function

Currently https://github.com/InstituteforDiseaseModeling/covasim/blob/master/covasim/sim.py#L266 is a really long function. We could try to break it up into smaller pieces that are easier to test.

Scrape best available epi data

This is an involved project and may even require its own repo, but creating an issue here to get the conversation started. The task is:

We need the best available auto-updated epidemiological data at as fine a geographical resolution as possible.

Specifically, the data we need is as many of the following as possible, in order of importance:

Number of deaths (on date died)
Number of positive diagnoses (on date test performed)
Number of importations (especially at start of outbreak)
Number of people hospitalized (on date hospitalized)
Number of people in ICU (on date admitted to ICU)
Number of negative diagnoses (on date test performed)

There are various tools that already collate some of this, e.g. https://neherlab.org/covid19/ and https://coronavirus.jhu.edu/map.html. The task is to find the best available data sources and collate everything into a consistent format. Top priority is Africa and LMIC countries, but with as broad coverage as possible.

Probabilities- 1.0 infection, symptomatic, severity, critical, death probabilities should not have recoveries

Found in regressing test cases.

In this simulation, every agent starts as infected, and the probabilities should put all of them in the fatal bucket:

    "n": 500,
    "n_infected": 500,
    "rel_crit_prob": 1.0,
    "rel_death_prob": 1.0,
    "rel_severe_prob": 1.0,
    "rel_symp_prob": 1.0,

The durations for each of those phases should be a constant 0 (a couple of examples)

        "crit2die": {
            "dist": "normal",
            "par1": 1,
            "par2": 0
        },

        "exp2inf": {
            "dist": "normal",
            "par1": 0,
            "par2": 0
        },

(I think I got all of them) but when the simulation runs, we only see 6 deaths and 494 recoveries.

Steps to repro:
Convert the attached TEST_sim_from_results_json.txt to a .py file.
Copy that .py file and the attached txt file into a directory where covasim is installed.
Run the script and look at the console output.

EXPECTED: 500 agents, 500 deaths
CURRENTLY: 500 agents, 6 deaths, 494 recoveries
TEST_sim_from_results_json.txt

DEBUG_test_disease_progression.DiseaseProgressionTests.txt

Scrolling is broken for app on Chrome

Run a simulation and get the graphs; extra space appears at the bottom of the page.

Make image smaller

Currently our image is multi-GB in size. Way too much for this project.

Write scheduled GH actions to pull country epi data

For now, the best data we know of is available at https://github.com/CSSEGISandData/COVID-19. We can pull the time series data, process it, and cache it locally with scheduled GitHub actions.

Scrape best available demographics data

Currently, Covasim with the usepopdata = False option uses a hard-coded age distribution from the US:
https://github.com/institutefordiseasemodeling/covasim/blob/develop/covasim/people.py#L315

We need to make this adaptable to as many different locations as possible. The best source for data on this is the UN World Population Prospects:

https://population.un.org/wpp/

We need a simple way of being able to choose the population data (i.e., set the age_data variable) based on any country in the world, e.g. make_randpop(sim, location='ethiopia').
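
A location lookup of this kind could be as simple as a dictionary keyed by location name. The sketch below is an assumption-laden illustration: AGE_DATA, its placeholder entries and proportions, and sample_ages are all hypothetical and do not reflect real census data or the actual make_randpop signature.

```python
import random

# Hypothetical age distributions: (min_age, max_age, proportion).
# The numbers are placeholders for illustration, not real census data.
AGE_DATA = {
    "placeholder_a": [(0, 19, 0.5), (20, 59, 0.4), (60, 99, 0.1)],
    "placeholder_b": [(0, 19, 0.2), (20, 59, 0.5), (60, 99, 0.3)],
}

def sample_ages(n, location, seed=None):
    """Draw n ages from the age distribution for `location`."""
    key = location.lower()
    if key not in AGE_DATA:
        raise KeyError(f"No age data for location {location!r}")
    rng = random.Random(seed)
    bins = AGE_DATA[key]
    weights = [b[2] for b in bins]
    chosen = rng.choices(bins, weights=weights, k=n)  # pick an age bin per person
    return [rng.randint(lo, hi) for lo, hi, _ in chosen]  # uniform age within the bin

ages = sample_ages(1000, "placeholder_a", seed=1)
```

Real entries would be loaded from the scraped data file rather than written inline.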

[UI] Improve frontend plotting

Related to #51, we need to think about how to fix plotting. Here's how plotting looks on the BE:

At minimum we need to plot data, but other changes would be good too. Options:

Keep Plotly and expand it
Move to Altair
Switch back to mpld3, so we can exactly reproduce backend figures

Reorder UI for intervention controls

1 pick country/region
2 specify duration
3 specify interventions (from #71)
4 (optional) advanced - rest of params

Upper limit of 56,800,235 people in the simulation

When generating a random population, Covasim uses sciris.uuid() to generate unique IDs for the population. Due to the safety attribute in the fast_uuid() function in sciris, the simulation is limited to about 56 million people (56,800,235), even though by default it should be able to generate 56,800,235,584 UIDs.

This is the error
ValueError: With a UID of type "ascii" and length 6, there are 56800235584 possible UIDs, and you requested 100000000, which exceeds the maximum allowed (56800235)
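
The numbers in the error message can be reproduced with a little arithmetic. The sketch below assumes the "ascii" alphabet is the 62 letters and digits and that the safety factor is 1000; both are inferred from the two numbers in the message rather than taken from the sciris source.

```python
import string

alphabet = string.ascii_letters + string.digits  # 62 characters (assumed alphabet)
length = 6
possible = len(alphabet) ** length  # total distinct UIDs of this length

# Assumed safety factor, inferred from the two numbers in the error message
safety_factor = 1000
max_allowed = possible // safety_factor

# Smallest UID length that would allow the requested population under the same rule
requested = 100_000_000
needed_length = length
while len(alphabet) ** needed_length // safety_factor < requested:
    needed_length += 1
```

Under these assumptions, one extra character in the UID length would be enough for 100 million people.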

Scrape workplace/industry data immediately

Got a request to look at reopening workplaces, perhaps by industry type or size. So, I could use a pair of hands or more to look at https://www.bls.gov/oes/current/oes_42660.htm and grab data from there on workplaces. Specifically data on age of workers, types of industry, workplace sizes (by industry too if available), at the finest granularity you can find (this page points at the Seattle-Tacoma-Bellevue metro area). Simple csvs that can be read in as pandas tables would be great. Please reach out if you can do this and are not working on modeling itself.

Move popdict to dataframe

Right now popdict is just that: a dict. By moving it to a dataframe, we'll have access to better querying via pandas, or full parallelization via dask later.

Enable full state runs with scaling

Covasim can run pretty well with populations up to ~100k. This is statistically enough to model even multi-million populations, assuming we can generate a sample similar to the overall population.

We can include functions that will automatically scale results and iterations to show numbers for the real state population. So, create a 100k population, calculate its fraction of the actual population (say, for a 5M state, it's going to be 0.02), and divide all simulated numbers by it (i.e., multiply them by 50) to get real-population figures.

@cliff one open question I'd have in this approach would be how to handle initial infections? Should we also scale them and round to ceiling? So for 10 initially infectious in 5M state, during sims we'll run 1? That will skew results quite drastically if I'm correct. Ideas?
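
To make the arithmetic concrete, here is a minimal sketch of the scaling idea, including the round-up behaviour for initial infections raised in the question above. The function names and the example counts are hypothetical; this is not existing Covasim functionality.

```python
import math

def scale_factor(sim_pop, real_pop):
    """Number of real people represented by each simulated agent."""
    return real_pop / sim_pop

def scale_results(results, sim_pop, real_pop):
    """Scale simulated counts up to the real population."""
    factor = scale_factor(sim_pop, real_pop)
    return {key: value * factor for key, value in results.items()}

def scale_seed_infections(real_infections, sim_pop, real_pop):
    """Scale initial infections down, rounding up so at least one seed remains."""
    return math.ceil(real_infections * sim_pop / real_pop)

scaled = scale_results({"deaths": 12, "infections": 900}, sim_pop=100_000, real_pop=5_000_000)
seeds = scale_seed_infections(10, sim_pop=100_000, real_pop=5_000_000)
```

In this example, 10 real infections become 1 simulated seed, which after scaling back represents 50 people; that is the distortion the question is about.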

Write export method for scenarios

See base.py:BaseSim for to_json() and to_excel() methods. Need to add these to the Scenarios class (run.py).

[CI] Compare each run against baseline

After #26 is done, we can add a CI job that will compare results. It can generate its own plot, show both plots next to each other, and show the difference in totals.

Add code of conduct

Example: https://github.com/scikit-learn/scikit-learn/blob/master/CODE_OF_CONDUCT.md

Refactor person.py:Person.infect method

Right now this method is a daisy chain of if statements that result in personalized probabilities of death etc. We could refactor it so that, instead of if statements, it does a probability calculation and allows dynamic addition of various modifiers, like availability of ventilators.
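
A probability calculation with pluggable modifiers might look like the following. This is only a sketch of the idea: death_probability, the modifier names, and the values are hypothetical placeholders, not calibrated estimates or existing Person code.

```python
def death_probability(base_prob, modifiers):
    """Multiply a base probability by each modifier and clamp to [0, 1]."""
    prob = base_prob
    for modifier in modifiers:
        prob *= modifier
    return min(max(prob, 0.0), 1.0)

# Hypothetical modifiers; the values are placeholders, not calibrated estimates
modifiers = {"age_over_70": 3.0, "no_ventilator_available": 2.0}
p = death_probability(0.01, modifiers.values())
```

New factors can then be added to the dictionary without touching the calculation itself.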

Create CLI tool

We could add a CLI tool that would look like covasim --population popdict.pickle --parameters params.yaml --output /someoutputdir and would generate a dataset and plots.
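
The argument handling for such a tool could be sketched with argparse from the standard library. Only the three flags come from the proposal above; build_parser and the help strings are assumptions, and the actual simulation call is left out.

```python
import argparse

def build_parser():
    """Build a parser for the three flags named in the proposal."""
    parser = argparse.ArgumentParser(prog="covasim", description="Run a simulation (sketch).")
    parser.add_argument("--population", required=True, help="Path to a pickled popdict")
    parser.add_argument("--parameters", required=True, help="Path to a YAML parameters file")
    parser.add_argument("--output", required=True, help="Directory for the dataset and plots")
    return parser

# Parse the example command line from the proposal
args = build_parser().parse_args(
    ["--population", "popdict.pickle", "--parameters", "params.yaml", "--output", "/someoutputdir"]
)
```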

Clean up controller with form model

Currently the logic for the run_sim endpoint is quite large. Let's move the params validation and cleaning into a form model.

covasim/covasim/webapp/cova_app.py

Lines 126 to 175 in cfc7300

    
    orig_pars = cv.make_pars()
    defaults = get_defaults(merge=True)
    web_pars = {}
    web_pars['verbose'] = verbose # Control verbosity here

    for key,entry in {**sim_pars, **epi_pars}.items():
        print(key, entry)

        best   = defaults[key]['best']
        minval = defaults[key]['min']
        maxval = defaults[key]['max']

        try:
            web_pars[key] = np.clip(float(entry['best']), minval, maxval)
        except Exception:
            user_key = entry['name']
            user_val = entry['best']
            err1 = f'Could not convert parameter "{user_key}", value "{user_val}"; using default value instead\n'
            print(err1)
            err += err1
            web_pars[key] = best
        if key in sim_pars: sim_pars[key]['best'] = web_pars[key]
        else:               epi_pars[key]['best'] = web_pars[key]

    # Convert durations
    web_pars['dur'] = sc.dcp(orig_pars['dur']) # This is complicated, so just copy it
    web_pars['dur']['exp2inf']['par1']  = web_pars.pop('web_exp2inf')
    web_pars['dur']['inf2sym']['par1']  = web_pars.pop('web_inf2sym')
    web_pars['dur']['crit2die']['par1'] = web_pars.pop('web_timetodie')
    web_dur = web_pars.pop('web_dur')
    for key in ['asym2rec', 'mild2rec', 'sev2rec', 'crit2rec']:
        web_pars['dur'][key]['par1'] = web_dur

    # Add the intervention
    web_pars['interventions'] = []
    if web_pars['web_int_day'] is not None:
        web_pars['interventions'] = cv.change_beta(days=web_pars.pop('web_int_day'), changes=(1-web_pars.pop('web_int_eff')))

    # Handle CFR -- ignore symptoms and set to 1
    prog_pars = cv.get_default_prognoses(by_age=False)
    web_pars['rel_symp_prob']   = 1.0/prog_pars.symp_prob
    web_pars['rel_severe_prob'] = 1.0/prog_pars.severe_prob
    web_pars['rel_crit_prob']   = 1.0/prog_pars.crit_prob
    web_pars['rel_death_prob']  = web_pars.pop('web_cfr')/prog_pars.death_prob

except Exception as E:
    err2 = f'Parameter conversion failed! {str(E)}\n'
    print(err2)
    err += err2

We can also add unit testing for the form model once it is created.
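
As a starting point for discussion, a form model could own the convert-and-clip step and collect error messages instead of printing them. The sketch below is a hypothetical shape, not the existing webapp code: ParameterForm and the placeholder default values are assumptions, and the duration and intervention handling is omitted.

```python
class ParameterForm:
    """Validates and clips submitted parameters against min/max defaults."""

    def __init__(self, defaults):
        self.defaults = defaults  # {key: {'best': ..., 'min': ..., 'max': ...}}
        self.errors = []

    def clean(self, submitted):
        """Return {key: value}, falling back to the default on bad input."""
        cleaned = {}
        for key, entry in submitted.items():
            spec = self.defaults[key]
            try:
                value = float(entry["best"])
                cleaned[key] = min(max(value, spec["min"]), spec["max"])
            except (TypeError, ValueError):
                self.errors.append(f'Could not convert parameter "{key}"; using default value instead')
                cleaned[key] = spec["best"]
        return cleaned

defaults = {"beta": {"best": 0.015, "min": 0.0, "max": 0.2}}  # placeholder values
form = ParameterForm(defaults)
ok = form.clean({"beta": {"best": "0.5"}})   # out of range, so clipped to max
bad = form.clean({"beta": {"best": "abc"}})  # not a number, so default is used
```

Because the class has no dependency on the web framework, it can be unit tested directly.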

Add a contributing doc

There are starting to be a lot of scattered READMEs. I propose we put together a doc focused on getting contributors up and running. It should include at least the following:

Some examples:
https://github.com/integrations/slack/blob/master/CONTRIBUTING.md
https://github.com/integrations/slack/blob/master/CODE_OF_CONDUCT.md

GitHub has some starter templates as well:
https://help.github.com/en/github/building-a-strong-community/creating-a-default-community-health-file
https://help.github.com/en/github/building-a-strong-community/adding-a-code-of-conduct-to-your-project

We may also want to create some templates to encourage proper formatting:
https://help.github.com/en/github/building-a-strong-community/configuring-issue-templates-for-your-repository
https://help.github.com/en/github/building-a-strong-community/creating-a-pull-request-template-for-your-repository

Add front end tooling

Currently we have a very simple front end with a single js file and a single css file. Do we want to add some level of testing, package management and build tooling?

Move checking bed_constraints to check_severe function

We should refactor the death calculation with bed constraints (and other medical constraints, like ventilators). Right now death is determined in the infect function, so at the time of infection. Beds can run out between infection and the onset of severe symptoms. We should check the availability of beds when severe symptoms occur, and if none is available, modify the probability of death.

Also, rather than determining death probability in infect, we should re-evaluate every epoch.
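
A minimal sketch of checking bed availability at the onset of severe symptoms, rather than at infection, is shown below. HealthSystem, death_prob_at_severe_onset, and the no_bed_multiplier value are hypothetical names and numbers chosen for illustration; they are not part of the current code.

```python
class HealthSystem:
    """Tracks bed occupancy; the capacity is a hypothetical parameter."""

    def __init__(self, n_beds):
        self.n_beds = n_beds
        self.occupied = 0

    def admit(self):
        """Try to take a bed; return True if one was available."""
        if self.occupied < self.n_beds:
            self.occupied += 1
            return True
        return False

def death_prob_at_severe_onset(base_prob, health_system, no_bed_multiplier=2.0):
    """Evaluate death probability when symptoms become severe, not at infection."""
    if health_system.admit():
        return base_prob
    return min(base_prob * no_bed_multiplier, 1.0)

hs = HealthSystem(n_beds=1)
first = death_prob_at_severe_onset(0.1, hs)   # gets the only bed
second = death_prob_at_severe_onset(0.1, hs)  # no bed left, so risk is raised
```

Releasing beds on recovery or death is left out here and would be needed for a per-epoch re-evaluation.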

After that is done, we should uncomment the corresponding tests for CI.