institutefordiseasemodeling / covasim
COVID-19 Agent-based Simulator (Covasim): a model for exploring coronavirus dynamics and interventions
Home Page: https://covasim.org
License: MIT License
Whenever we release (build, push, and update PyPI), we could also automatically deploy the new version to Azure, assuming the infrastructure is ready.
Currently the webapp runs on a single VM. We should set it up in a highly available, load-balanced, and otherwise more robust way.
The user group wants to know the output when an intervention is implemented, lifted, then re-implemented at a future date.
Possible solution: A range slider for intervention-start-day with a "+" button to add a new point on the slider for intervention-end-day. The "+" button could then add a third point for a re-implementation day.
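A minimal sketch of how the slider points (start, end, re-start) might be represented on the backend, assuming the intervention scales transmission by a fixed factor while it is active. The names `intervention_active` and `beta_multiplier` and the `efficacy` value are hypothetical and not part of Covasim.

```python
from typing import List, Optional, Tuple

# One (start_day, end_day) pair per period in which the intervention is on.
# end_day is exclusive; None means "active until the end of the run".
Period = Tuple[int, Optional[int]]


def intervention_active(day: int, periods: List[Period]) -> bool:
    """Return True if `day` falls inside any active period."""
    for start, end in periods:
        if day >= start and (end is None or day < end):
            return True
    return False


def beta_multiplier(day: int, periods: List[Period], efficacy: float = 0.5) -> float:
    """Scale factor applied to transmission on `day` (1.0 means no change)."""
    return 1.0 - efficacy if intervention_active(day, periods) else 1.0


# Implemented on day 10, lifted on day 30, re-implemented on day 45.
periods = [(10, 30), (45, None)]
multipliers = [beta_multiplier(d, periods) for d in (5, 10, 29, 30, 44, 45, 59)]
```

Each extra "+" point on the slider would then just append or close a period in the list.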
We are starting to get more people involved in this project. It may be nice to allow some people in the org to do some simple project management, to make it easier to tell what is open to be worked on and what is in progress.
Proposal:
Once #43 is done, we need to calibrate to the data. The Sim object already has a likelihood() method, but this needs to be improved (more data types, probably using cumulative rather than daily counts, etc.). We also need to decide what model parameters will be used for fitting -- potentially just beta and n_infected, but if others, we may need a smarter optimization approach. We are almost there with a fancy method called BINNTS (bootstrapped iterative nearest-neighbor threshold sampling), but it's not quite ready to go, so we could also look into an off-the-shelf method, especially if we just want the MLE and don't need the whole posterior.
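A rough sketch of what a cumulative-count likelihood and a brute-force MLE over beta could look like. This is not the existing likelihood() method and not BINNTS: it assumes a Poisson error model (a simplification, since cumulative counts are not independent), and `toy_model` is a hypothetical stand-in for running the Sim.

```python
import math
from itertools import accumulate
from typing import Callable, Sequence


def poisson_loglike(observed: Sequence[int], expected: Sequence[float]) -> float:
    """Sum of Poisson log-probabilities of observed counts given expected counts."""
    total = 0.0
    for k, lam in zip(observed, expected):
        lam = max(lam, 1e-9)  # avoid log(0) when the model predicts zero
        total += k * math.log(lam) - lam - math.lgamma(k + 1)
    return total


def cumulative_loglike(obs_daily: Sequence[int], model_daily: Sequence[float]) -> float:
    """Compare cumulative rather than daily counts."""
    return poisson_loglike(list(accumulate(obs_daily)), list(accumulate(model_daily)))


def grid_mle(obs_daily, model: Callable[[float], Sequence[float]], betas):
    """Return the beta in `betas` with the highest log-likelihood."""
    return max(betas, key=lambda b: cumulative_loglike(obs_daily, model(b)))


def toy_model(beta: float, n_days: int = 10):
    """Hypothetical stand-in for a sim run: exponential growth in daily cases."""
    return [math.exp(beta * t) for t in range(n_days)]


observed = [round(x) for x in toy_model(0.3)]
best_beta = grid_mle(observed, toy_model, [0.1, 0.2, 0.3, 0.4, 0.5])
```

A grid search only scales to one or two parameters; fitting more would need an actual optimizer.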
Currently the simulation runs only on a single core. Can we add multiprocessing to make use of more workers and run the simulation faster?
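One low-effort option is to parallelize across independent runs (different seeds) rather than inside a single run. A minimal sketch with the standard library, where `run_one` is a hypothetical stand-in for building and running one sim, not Covasim's actual model:

```python
import random
from multiprocessing import Pool
from typing import Iterable, List


def run_one(seed: int, n_days: int = 60) -> int:
    """Toy stand-in for one simulation run; returns the final infected count."""
    rng = random.Random(seed)
    infected = 10
    for _ in range(n_days):
        # Each infected agent causes a new infection with 5% probability per day.
        infected += sum(rng.random() < 0.05 for _ in range(infected))
    return infected


def run_parallel(seeds: Iterable[int], n_workers: int = 4) -> List[int]:
    """Run one simulation per seed, spread over `n_workers` processes."""
    with Pool(processes=n_workers) as pool:
        return pool.map(run_one, list(seeds))
```

On Windows and macOS the call to `run_parallel(range(8))` has to sit under an `if __name__ == "__main__":` guard, because worker processes re-import the module.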
Checkout action fails, probably after move from private repo
Investigate ways to reduce model run time -- more use of arrays rather than loops, sparse matrices, Numba, etc.
Need a button to upload data. Format of data should be like
https://github.com/InstituteforDiseaseModeling/covasim/blob/master/tests/example_data.csv
in either Excel or CSV format. See the test
https://github.com/InstituteforDiseaseModeling/covasim/blob/master/tests/test_sim.py#L121
for usage. If uploaded, data should be plotted against sim results (as happens on the backend).
Related issue: #50
Currently, there are no settings in the webapp to address eventual shortages of, for example, ICU beds. It would be great to include these in the webapp, so that when severe cases pass a pre-configured threshold, mortality increases.
Avoid hard coding, more flexible file reading, and better likelihood calculations
Followed the instructions, but got the error in the title when trying to import covasim. Same error if I install in terminal and try to run a sim.
Thanks!
#45 This, but on UI
Plotly, while functional, lacks some features like interactive charts and well, it's not quite as "snazzy" as, for example, https://altair-viz.github.io/getting_started/overview.html
We could move our UI charts to it
Right now https://github.com/institutefordiseasemodeling/covasim/blob/develop/covasim/people.py#L315 is hardcoded to Seattle. We could use https://github.com/neherlab/covid19_scenarios/blob/da1381c429471c3d57b02cd5a7b6dc46e645c0b3/src/assets/data/country_age_distribution.json to get per-country ratios. Later we can write our own scraper.
Work from @aouedraogo and @lskrip-IDM 's recent report to get oxygen support and other health system parameters built into Covasim
Use census data to generate age distributions per state. This will enable us to get integrated into the unified UI quickly.
Hi -
After I set up the project I was able to run the first example, but not the second (python examples/run_sim.py) due to what looks like a data type mismatch. Because I didn't specify any data types in the code, I wonder if this is a bug?
See my command prompt code and log below:
(covasim) C:\Users\laurel\covasim>python examples/run_sim.py
Importing...
Covasim 0.22.0 (2020-03-31) — © 2020 by IDM
Note: synthpops (for detailed demographic data) is not available (No module named 'synthpops')
Elapsed time: 3.06 s
Making sim...
Running...
TEMP init people
Creating 20000.0 people...
Created 20000 people, average age 38.58 years
Running day 0 of 60 (0.60 s elapsed)...
Running day 1 of 60 (0.60 s elapsed)...
Running day 2 of 60 (0.60 s elapsed)...
Traceback (most recent call last):
  File "examples/run_sim.py", line 35, in <module>
    sim.run(verbose=verbose)
  File "C:\Users\laurel\covasim\covasim\sim.py", line 396, in run
    transmission_inds = cvu.bf(thisbeta, person.contacts)
  File "C:\Users\laurel\covasim\lib\site-packages\numba-0.48.0-py3.8-win32.egg\numba\dispatcher.py", line 574, in _explain_matching_error
    raise TypeError(msg)
TypeError: No matching definition for argument type(s) float64, array(int32, 1d, C)
Support for the cova_app.py to accept the necessary parameters for the intervention data
Building on the list of interventions listed here:
https://docs.google.com/document/d/1Q3-cLk9lo8Oizgs4JZTnqDCBjWu3oE93S42B9rSinfY/edit#
Provide a list of interventions that would be most useful to Africa CDC (e.g., different assumptions about social distancing and testing capacity)
After #9 is done, we should deploy new website and point domain to it
Currently, stdout is managed by a verbose variable passed between functions. This works, but better options might be available. Consider replacing it with a logger for greater flexibility.
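A small sketch of how the standard `logging` module could sit behind the existing `verbose` argument, so callers never have to touch the logger themselves. The logger name, the integer-to-level mapping, and the toy `run` function are assumptions for illustration only.

```python
import logging

logger = logging.getLogger("covasim_sketch")  # hypothetical logger name
logger.addHandler(logging.NullHandler())

# Assumed mapping from the legacy integer verbosity to logging levels.
_VERBOSE_TO_LEVEL = {0: logging.WARNING, 1: logging.INFO, 2: logging.DEBUG}


def resolve_level(verbose) -> int:
    """Accept either a level name ('info') or a legacy integer (0, 1, 2)."""
    if isinstance(verbose, str):
        level = getattr(logging, verbose.upper(), None)
        if not isinstance(level, int):
            raise ValueError(f"Unknown log level: {verbose!r}")
        return level
    return _VERBOSE_TO_LEVEL.get(int(verbose), logging.DEBUG)


def run(n_days: int = 3, verbose=1) -> None:
    """Toy stand-in for sim.run() that logs instead of printing."""
    logger.setLevel(resolve_level(verbose))
    for day in range(n_days):
        logger.info("Running day %d of %d", day, n_days)
```

With this shape, `run(verbose=1)` and `run(verbose='info')` behave the same, and output destinations can be changed by attaching handlers.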
Requirements:
Replacing sim.run(verbose=1) with sim.run(verbose='info') is OK; from covasim.utils import logger; logger.setLevel('INFO'); sim.run() is not
GH action that, whenever we release a new version, will build and publish the Docker image and push the new package to PyPI
Currently https://github.com/InstituteforDiseaseModeling/covasim/blob/master/covasim/sim.py#L266 is a really long function. We could try to break it up into smaller pieces that are easier to test.
This is an involved project and may even require its own repo, but creating an issue here to get the conversation started. The task is:
We need the best available auto-updated epidemiological data at as fine a geographical resolution as possible.
Specifically, the data we need is as many of the following as possible, in order of importance:
There are various tools that already collate some of this, e.g. https://neherlab.org/covid19/ and https://coronavirus.jhu.edu/map.html. The task is to find the best available data sources and collate everything into a consistent format. Top priority is Africa and LMIC countries, but with as broad coverage as possible.
Found in regressing test cases.
In this simulation, every agent starts as infected, and the probabilities should put all of them in the fatal bucket:
"n": 500,
"n_infected": 500,
"rel_crit_prob": 1.0,
"rel_death_prob": 1.0,
"rel_severe_prob": 1.0,
"rel_symp_prob": 1.0,
The durations for each of those phases should be a constant 0 (a couple of examples)
"crit2die": {
"dist": "normal",
"par1": 1,
"par2": 0
},
"exp2inf": {
"dist": "normal",
"par1": 0,
"par2": 0
},
(I think I got all of them) but when the simulation runs, we only see 6 deaths and 494 recoveries.
Steps to repro:
Convert the attached TEST_sim_from_results_json.txt to a .py file
Copy that .py file and the attached txt file into a directory where covasim is installed
Run the script and look at the console output
EXPECTED: 500 agents, 500 deaths
CURRENTLY: 500 agents, 6 deaths, 494 recoveries
TEST_sim_from_results_json.txt
Run, get graphs, get space at bottom
Currently our image is multi-GB in size. Way too much for this project.
For now, the best data we know of is available at https://github.com/CSSEGISandData/COVID-19. We can pull the time series data, process it, and cache it locally with scheduled GitHub Actions.
Currently, Covasim with the usepopdata = False option uses a hard-coded age distribution from the US:
https://github.com/institutefordiseasemodeling/covasim/blob/develop/covasim/people.py#L315
We need to make this adaptable to as many different locations as possible. The best source for data on this is the UN World Pop:
https://population.un.org/wpp/
We need a simple way of being able to choose the population data (i.e., set the age_data variable) based on any country in the world, e.g. make_randpop(sim, location='ethiopia').
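A sketch of what a location lookup could look like, assuming age data is stored as bins of (min age, max age, proportion). The table below contains PLACEHOLDER numbers for a made-up location, not real demographic data; `AGE_DATA`, `get_age_data`, and `sample_ages` are hypothetical names.

```python
import random
from typing import Dict, List, Tuple

AgeBin = Tuple[int, int, float]  # (min_age, max_age inclusive, proportion)

# PLACEHOLDER values for illustration only; real entries would be loaded
# from a data source such as UN WPP, one key per country.
AGE_DATA: Dict[str, List[AgeBin]] = {
    "example_location": [(0, 19, 0.5), (20, 59, 0.4), (60, 99, 0.1)],
}


def get_age_data(location: str) -> List[AgeBin]:
    """Look up the age bins for a location (case-insensitive)."""
    key = location.lower().strip()
    if key not in AGE_DATA:
        raise KeyError(f"No age data for location '{location}'")
    return AGE_DATA[key]


def sample_ages(location: str, n: int, seed: int = 0) -> List[int]:
    """Draw n ages: pick a bin by its proportion, then a uniform age within it."""
    rng = random.Random(seed)
    bins = get_age_data(location)
    chosen = rng.choices(bins, weights=[b[2] for b in bins], k=n)
    return [rng.randint(lo, hi) for lo, hi, _ in chosen]
```

A function like make_randpop could then take `location` and pass it straight through to the lookup.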
Related to #51, we need to think about how to fix plotting. Here's how plotting looks on the BE:
At minimum we need to plot data, but other changes would be good too. Options:
1 pick country/region
2 specify duration
3 specify interventions (from #71)
4 (optional) advanced - rest of params
When generating a random population, it uses sciris.uuid() to generate unique IDs for the population. Due to the safety attribute in the fast_uuid() function in sciris, this limits the simulation to about 56 million people, even though by default it should be able to generate 56,800,235,584 UIDs.
This is the error
ValueError: With a UID of type "ascii" and length 6, there are 56800235584 possible UIDs, and you requested 100000000, which exceeds the maximum allowed (56800235)
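The numbers in the error message can be reproduced with a few lines of arithmetic. This sketch assumes a 62-character alphabet (ASCII letters plus digits) and a safety factor of 1000, both inferred from the message rather than read from the sciris source; the function names are hypothetical.

```python
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters (assumed)


def uid_capacity(length: int = 6) -> int:
    """Number of distinct UIDs of the given length."""
    return len(ALPHABET) ** length


def max_allowed(length: int = 6, safety: int = 1000) -> int:
    """Largest request permitted when capacity is divided by a safety factor."""
    return uid_capacity(length) // safety


def min_length_for(n_people: int, safety: int = 1000) -> int:
    """Shortest UID length whose allowed maximum covers n_people."""
    length = 1
    while max_allowed(length, safety) < n_people:
        length += 1
    return length
```

Under these assumptions, a population of 100,000,000 would need UIDs of length 7, or a lower safety factor.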
Got a request to look at reopening workplaces, perhaps by industry type or size. So, I could use a pair of hands or more to look at https://www.bls.gov/oes/current/oes_42660.htm and grab data from there on workplaces. Specifically data on age of workers, types of industry, workplace sizes (by industry too if available), at the finest granularity you can find (this page points at the Seattle-Tacoma-Bellevue metro area). Simple csvs that can be read in as pandas tables would be great. Please reach out if you can do this and are not working on modeling itself.
Right now the pop dict is just that: a dict. By moving this to a DataFrame, we'll have access to better querying via pandas, or full parallelization via Dask later.
Covasim can run pretty well with populations up to ~100k. This is statistically enough to model even multi-million populations, assuming we can generate a sample similar to the overall population.
We can include functions that will automatically scale results and iterations to show numbers for the real state population. So, create a 100k pop, calculate its fraction of the actual pop (say, for a 5M state, it's going to be 0.02), and divide all numbers by it (i.e., multiply by 50).
@cliff one open question I'd have in this approach would be how to handle initial infections? Should we also scale them and round to ceiling? So for 10 initially infectious in 5M state, during sims we'll run 1? That will skew results quite drastically if I'm correct. Ideas?
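To make the arithmetic concrete, here is a sketch of scaling simulated counts up to the real population, plus the ceiling-rounded seeding asked about above. Function names are hypothetical.

```python
import math
from typing import Dict


def scale_factor(sim_pop: int, real_pop: int) -> float:
    """How many real people each simulated agent represents."""
    return real_pop / sim_pop


def scale_results(results: Dict[str, float], sim_pop: int, real_pop: int) -> Dict[str, float]:
    """Scale simulated counts up to the real population."""
    factor = scale_factor(sim_pop, real_pop)
    return {key: value * factor for key, value in results.items()}


def scaled_initial_infections(real_infected: int, sim_pop: int, real_pop: int) -> int:
    """Scale seed infections down to the sim population, rounding up."""
    return math.ceil(real_infected * sim_pop / real_pop)
```

For a 100k sim of a 5M state the factor is 50, and 10 real seed infections round up to 1 simulated infection, which then stands for 50 real ones. That is the skew the question is pointing at.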
See base.py:BaseSim for the to_json() and to_excel() methods. Need to add these to the Scenarios class (run.py).
After #26 is done, we can add a CI job that will compare results. It can generate its own plot, show both plots next to each other, and show the difference in totals.
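The "difference in totals" part could be as simple as the sketch below. It assumes each results file has been loaded into a dict mapping a result name to a list of daily values; that layout and the function names are assumptions, not the actual output format.

```python
from typing import Dict, List


def diff_totals(baseline: Dict[str, List[float]],
                candidate: Dict[str, List[float]]) -> Dict[str, float]:
    """Per-result difference in totals (candidate minus baseline)."""
    keys = sorted(set(baseline) | set(candidate))
    return {k: sum(candidate.get(k, [])) - sum(baseline.get(k, [])) for k in keys}


def format_report(diffs: Dict[str, float]) -> str:
    """Render the differences as plain text for a CI log or PR comment."""
    return "\n".join(f"{name}: {delta:+g}" for name, delta in diffs.items())
```

A result that exists in only one of the two files is treated as zero in the other, so added or removed outputs show up in the report.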
Right now this method is a daisy chain of if statements that result in personalized probabilities of death etc. We could refactor it so that, instead of if statements, it does a probability calculation and allows dynamic addition of various modifiers, like availability of ventilators.
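One way this could look: each modifier is a small function that takes a person and the current probability and returns an adjusted probability, and the list of modifiers can be extended at runtime. The multipliers and field names below are made up for illustration.

```python
from typing import Callable, Dict, List

Person = Dict[str, float]
Modifier = Callable[[Person, float], float]


def age_modifier(person: Person, prob: float) -> float:
    """Hypothetical: double the risk for people aged 70 and over."""
    return prob * (2.0 if person["age"] >= 70 else 1.0)


def no_ventilator_modifier(person: Person, prob: float) -> float:
    """Hypothetical: raise the risk by 50% when no ventilator is available."""
    return prob * (1.0 if person["has_ventilator"] else 1.5)


def death_probability(person: Person, base_prob: float, modifiers: List[Modifier]) -> float:
    """Apply each modifier in turn, then clamp the result to [0, 1]."""
    prob = base_prob
    for modify in modifiers:
        prob = modify(person, prob)
    return min(max(prob, 0.0), 1.0)
```

Adding a new constraint then means appending one function to the list instead of editing the if chain.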
We could add a CLI tool that would look like covasim --population popdict.pickle --parameters params.yaml --output /someoutputdir and that would generate the dataset and plots.
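A minimal argparse sketch of the proposed interface, using only the three flags from the example command. Loading the files and running the sim are left out, and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for the proposed `covasim` command line (flags from the proposal)."""
    parser = argparse.ArgumentParser(
        prog="covasim",
        description="Run a simulation and write the dataset and plots.",
    )
    parser.add_argument("--population", required=True, help="Path to a pickled popdict")
    parser.add_argument("--parameters", required=True, help="Path to a YAML parameters file")
    parser.add_argument("--output", required=True, help="Directory for results and plots")
    return parser


args = build_parser().parse_args(
    ["--population", "popdict.pickle", "--parameters", "params.yaml", "--output", "/someoutputdir"]
)
```

Exposing this as a console entry point would let it be invoked as `covasim ...` after installation.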
Currently the logic for the run_sim endpoint is quite large. Let's move the params validation and cleaning into a form model.
covasim/covasim/webapp/cova_app.py
Lines 126 to 175 in cfc7300
We can also add unit testing for the form model once it is created.
There are beginning to be a lot of scattered READMEs. I propose we put together a doc focused on getting contributors up and running. It should include at least the following
Some examples:
https://github.com/integrations/slack/blob/master/CONTRIBUTING.md
https://github.com/integrations/slack/blob/master/CODE_OF_CONDUCT.md
GitHub has some starter templates as well:
https://help.github.com/en/github/building-a-strong-community/creating-a-default-community-health-file
https://help.github.com/en/github/building-a-strong-community/adding-a-code-of-conduct-to-your-project
We may also want to create some templates to encourage proper formatting.
https://help.github.com/en/github/building-a-strong-community/configuring-issue-templates-for-your-repository
https://help.github.com/en/github/building-a-strong-community/creating-a-pull-request-template-for-your-repository
Currently we have a very simple front end with a single js file and a single css file. Do we want to add some level of testing, package management and build tooling?
We should refactor the death calculation with bed constraints (and other medical constraints, like ventilators). Right now death is determined in the infect function, i.e. at the time of infection. Beds can run out between infection and the onset of severe symptoms. We should check the availability of beds when severe symptoms occur and, if none are available, modify the probability of death.
Also, rather than determining death probability in infect, we should re-evaluate it every epoch.
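A sketch of moving the bed check to severe-symptom onset. The `HealthSystem` class, the function name, and the `no_bed_factor` value are hypothetical; the point is only that the bed is claimed (or not) when symptoms turn severe, not at infection.

```python
from dataclasses import dataclass


@dataclass
class HealthSystem:
    """Tracks bed occupancy; other constraints (ventilators) could be added."""
    n_beds: int
    beds_in_use: int = 0

    def try_admit(self) -> bool:
        """Claim a bed if one is free; return whether admission succeeded."""
        if self.beds_in_use < self.n_beds:
            self.beds_in_use += 1
            return True
        return False

    def discharge(self) -> None:
        """Free a bed on recovery or death."""
        self.beds_in_use = max(0, self.beds_in_use - 1)


def death_prob_at_severe_onset(base_prob: float, system: HealthSystem,
                               no_bed_factor: float = 2.0) -> float:
    """Evaluate death probability when severe symptoms start, not at infection."""
    has_bed = system.try_admit()
    prob = base_prob if has_bed else base_prob * no_bed_factor
    return min(prob, 1.0)
```

Re-evaluating every epoch would mean calling a function like this from the per-timestep update instead of once from infect.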
To be started after official release
Some tests will produce actual simulation results.
Every time a PR is merged, we can run these, save the results as JSON, and generate a plot. This can be committed to the repo via actions. This will produce a baseline that will allow us to compare PR runs against it and generate a report to help reviewers.
Currently we don't run these because of how strict they are about changes in simulation results. Because this is a stochastic model, there is a lot of uncertainty about what the results actually look like at the end (within reason).
We should refactor the unit tests so that they always succeed unless someone makes a meaningful change to results (for example, if mortality is 0, the number of deaths should always be 0).
After that is done, we should uncomment these tests for CI.
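A sketch of the kind of test meant here: instead of asserting exact outputs, assert invariants that must hold for every random seed. `toy_sim` is a hypothetical stand-in for a real sim run.

```python
import random
from typing import Dict, Iterable


def toy_sim(n_people: int, death_prob: float, seed: int) -> Dict[str, int]:
    """Toy stand-in: every agent is infected and dies with probability death_prob."""
    rng = random.Random(seed)
    deaths = sum(rng.random() < death_prob for _ in range(n_people))
    return {"deaths": deaths, "recoveries": n_people - deaths}


def check_invariants(seeds: Iterable[int] = range(20)) -> None:
    """Properties that hold regardless of stochastic variation."""
    for seed in seeds:
        none_die = toy_sim(500, 0.0, seed)
        assert none_die["deaths"] == 0, "mortality 0 must give 0 deaths"

        all_die = toy_sim(500, 1.0, seed)
        assert all_die["deaths"] == 500, "mortality 1 must kill every agent"

        mixed = toy_sim(500, 0.3, seed)
        assert mixed["deaths"] + mixed["recoveries"] == 500, "agents are conserved"
```

Tests of this shape fail only when the model logic changes, not when the random number stream does.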