
covasim's People

Contributors

adityasharad, chjones-idm, ckerr-idm, cliffckerr, cwiswell-idm, daniel-klein, dependabot[bot], devclinton, deviozc, dlukacevic-idm, dmistry-idm, ghart-idm, gwincr11, hamelsmu, imcatta, jamiecohen, jps1, jschripsema-idm, jules2689, krosenfeld-idm, lmgeorge, mfisher-idmod, pausz, pselvaraj87, rasmuswl, rbelew, robynstuart, romesha, sayers24, sbuxton-idm

covasim's Issues

CD pipeline

Whenever we release (build, push, and update PyPI), we could also automatically deploy the new version to Azure, assuming the infrastructure is ready.

UI - Support for interventions for specific time frames

The user group wants to know the output when an intervention is implemented, lifted, then re-implemented at a future date.
Possible solution: A range slider for intervention-start-day with a "+" button to add a new point on the slider for intervention-end-day. The "+" button could then add a third point for a re-implementation day.
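
One way to represent the implement/lift/re-implement pattern behind such a slider is a list of day ranges. The sketch below is illustrative only: `InterventionWindow` and `is_active` are hypothetical names, not part of Covasim or the web UI.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InterventionWindow:
    """One period during which the intervention is in force.

    The start day is inclusive and the end day is exclusive.
    """
    start_day: int
    end_day: Optional[int] = None  # None means the intervention is never lifted


def is_active(windows: List[InterventionWindow], day: int) -> bool:
    """Return True if any window covers the given simulation day."""
    for w in windows:
        if day >= w.start_day and (w.end_day is None or day < w.end_day):
            return True
    return False


# Implemented on day 10, lifted on day 30, re-implemented on day 45
windows = [InterventionWindow(10, 30), InterventionWindow(45)]
```

Each "+" click in the proposed slider would then correspond to closing the current window or opening a new one.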

Project management tooling

We are starting to get more people involved in this project. It may be nice to allow some people in the org to do some simple project management, to make it easier to tell what is open to be worked on and what is in progress.

Proposal:

  • Set up a simple project board with backlog, in-progress, and finished columns. Set up automation on this board so that it updates automatically when possible.
  • Allow some of the core contributors to self-assign issues.
  • Determine who is in charge of reviewing and merging code.
  • Define a process for requesting reviews and reviewing/merging items.

Automatic calibration

Once #43 is done, we need to calibrate to the data. The Sim object already has a likelihood() method, but this needs to be improved (more data types, probably using cumulative rather than daily counts, etc.). We also need to decide which model parameters will be used for fitting -- potentially just beta and n_infected, but if we fit others, we may need a smarter optimization approach. We are almost there with a fancy method called BINNTS (bootstrapped iterative nearest-neighbor threshold sampling), but it's not quite ready to go, so we could also look into an off-the-shelf method, especially if we just want the MLE and don't need the whole posterior.

Enable multiprocessing

Currently the simulation runs on only a single core. Can we add multiprocessing to make use of more workers and run the simulation faster?
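
A minimal sketch of one option, assuming the goal is to run several independent simulations (for example, different random seeds) in parallel rather than to parallelize a single run. `run_one` is a stand-in for a real simulation call, and both function names are hypothetical.

```python
import random
from multiprocessing import Pool


def run_one(seed):
    """Stand-in for one independent simulation run; returns a fake infection count."""
    rng = random.Random(seed)
    return sum(rng.random() < 0.1 for _ in range(1000))


def run_many(seeds, n_workers=1):
    """Run one simulation per seed, serially or across worker processes."""
    if n_workers == 1:
        return [run_one(s) for s in seeds]
    with Pool(processes=n_workers) as pool:
        return pool.map(run_one, seeds)
```

On platforms that start workers by spawning a fresh interpreter, a call such as `run_many(range(8), n_workers=4)` needs to sit under an `if __name__ == "__main__":` guard. Because each run is seeded independently, the serial and parallel paths should give the same results.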

Improve code performance

Investigate ways to reduce model run time -- more use of arrays rather than loops, sparse matrices, Numba, etc.

Include healthsystems into web

Currently, there are no configuration options in the webapp to address eventual shortages of, for example, ICU beds. It would be great to include these in the webapp. This would mean that when the number of severe cases passes a pre-configured threshold, mortality increases.

Improve data and likelihood functions

Avoid hard coding, more flexible file reading, and better likelihood calculations

  • Load CSV files
  • Load all columns
  • Automatic column matching
  • Include in plotting / remove data_mapping
  • Extend likelihood function and use cumulative counts
  • Write data file documentation / include example
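
A rough sketch of how the first few items could fit together using only the standard library. The alias table, column names, and the three data rows are made up for illustration; they are not Covasim's actual data format.

```python
import csv
import io
from itertools import accumulate

# Hypothetical aliases: canonical name -> header spellings that should map to it
ALIASES = {
    "date": {"date", "day"},
    "new_diagnoses": {"new_diagnoses", "positives", "new positives"},
    "new_deaths": {"new_deaths", "deaths"},
}


def canonical(header):
    """Map a raw CSV header to its canonical column name."""
    key = header.strip().lower()
    for name, spellings in ALIASES.items():
        if key in spellings:
            return name
    return key  # Unknown columns are kept under their normalized header


def load_data(handle):
    """Read every column of a CSV into a dict of lists keyed by canonical name."""
    reader = csv.DictReader(handle)
    columns = {canonical(h): [] for h in reader.fieldnames}
    for row in reader:
        for header, value in row.items():
            columns[canonical(header)].append(value)
    return columns


def cumulative(values):
    """Convert daily counts (as strings or ints) to cumulative counts."""
    return list(accumulate(int(v) for v in values))


# Made-up example rows, used only to exercise the functions
example = io.StringIO("Date,Positives,Deaths\n2020-03-01,3,0\n2020-03-02,5,1\n2020-03-03,4,0\n")
data = load_data(example)
```

With the columns loaded this way, a likelihood function could compare `cumulative(data["new_diagnoses"])` against the model's cumulative output instead of the noisier daily counts.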

Receiving type error at runtime

Hi -
After I set up the project I was able to run the first example, but not the second (python examples/run_sim.py) due to what looks like a data type mismatch. Because I didn't specify any data types in the code, I wonder if this is a bug?

See my command prompt code and log below:

(covasim) C:\Users\laurel\covasim>python examples/run_sim.py
Importing...
Covasim 0.22.0 (2020-03-31) — © 2020 by IDM
Note: synthpops (for detailed demographic data) is not available (No module named 'synthpops')

Elapsed time: 3.06 s
Making sim...
Running...
TEMP init people
  Creating 20000.0 people...
  Created 20000 people, average age 38.58 years
  Running day 0 of 60 (0.60 s elapsed)...
  Running day 1 of 60 (0.60 s elapsed)...
  Running day 2 of 60 (0.60 s elapsed)...
Traceback (most recent call last):
  File "examples/run_sim.py", line 35, in <module>
    sim.run(verbose=verbose)
  File "C:\Users\laurel\covasim\covasim\sim.py", line 396, in run
    transmission_inds = cvu.bf(thisbeta, person.contacts)
  File "C:\Users\laurel\covasim\lib\site-packages\numba-0.48.0-py3.8-win32.egg\numba\dispatcher.py", line 574, in _explain_matching_error
    raise TypeError(msg)
TypeError: No matching definition for argument type(s) float64, array(int32, 1d, C)

Enable UI ground truth overlay

Since we'll have the latest data cached by #48 and countries selectable by #44, we can overlay the number of confirmed cases, deaths, and recoveries on the simulation results.

Accessibility - input labels/button names/color contrast/keyboard nav

  • WCAG 4.1.2 - buttons must have discernible text
  • WCAG 1.3.1, WCAG 3.3.2 - input controls must be associated with their labels
  • WCAG 1.4.3 - contrast ratio between text and background must meet AA standards
  • Keyboard navigation - tooltip information should be accessible for non-mouse users
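
The contrast item can be checked mechanically. The sketch below follows the WCAG 2.x definitions of relative luminance and contrast ratio, with the AA thresholds of 4.5:1 for normal text and 3:1 for large text. It is written in Python for illustration; the function names are hypothetical and nothing here is taken from the web app itself.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to its linear-light value."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) colour with 0-255 channels."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg):
    """Contrast ratio between two colours; the order of arguments does not matter."""
    lighter, darker = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(fg, bg, large_text=False):
    """True if the colour pair meets the WCAG AA contrast threshold."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)
```

For example, `meets_aa((0, 0, 0), (255, 255, 255))` holds for black text on a white background, while a light grey such as `(170, 170, 170)` on white falls short.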

Consider using logger instead of verbose

Currently, stdout is managed by a verbose variable passed between functions. This works, but better options might be available. Consider replacing with logger for greater flexibility.

Requirements:

  • Equally easy to configure, e.g. replacing sim.run(verbose=1) with sim.run(verbose='info') is OK; from covasim.utils import logger; logger.setLevel('INFO'); sim.run() is not
  • Same levels of detail are implemented: e.g. 0 = only warnings/errors to stdout, 1 = default output, 2 = print everything possible (debug).
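
A minimal sketch of how both requirements might be met with the standard `logging` module: the caller still passes a single `verbose` argument, which is translated to a log level internally. The logger name, `resolve_level`, and the stand-in `run` function are assumptions for illustration, not Covasim code.

```python
import logging
import sys

logger = logging.getLogger("covasim_sketch")  # hypothetical logger name

# Mapping proposed in the requirements: 0 = warnings/errors, 1 = default, 2 = debug
_LEVELS = {0: logging.WARNING, 1: logging.INFO, 2: logging.DEBUG}


def resolve_level(verbose):
    """Accept either the existing integers (0, 1, 2) or a level name such as 'info'."""
    if isinstance(verbose, str):
        level = logging.getLevelName(verbose.upper())
        if not isinstance(level, int):
            raise ValueError(f"Unknown log level: {verbose!r}")
        return level
    return _LEVELS[int(verbose)]


def run(verbose=1, n_days=3):
    """Stand-in for sim.run(); only the logging behaviour is illustrated."""
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler(sys.stdout))
    logger.setLevel(resolve_level(verbose))
    logger.debug("Starting run with n_days=%s", n_days)
    for day in range(n_days):
        logger.info("Running day %s of %s...", day, n_days)
```

With this shape, `run(verbose=1)` and `run(verbose='info')` behave the same, and no separate `logger.setLevel(...)` call is needed from user code.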

@RomeshA do you want to take a look at this? cc @gwincr11

Scrape best available epi data

This is an involved project and may even require its own repo, but creating an issue here to get the conversation started. The task is:

We need the best available auto-updated epidemiological data at as fine a geographical resolution as possible.

Specifically, the data we need is as many of the following as possible, in order of importance:

  1. Number of deaths (on date died)
  2. Number of positive diagnoses (on date test performed)
  3. Number of importations (especially at start of outbreak)
  4. Number of people hospitalized (on date hospitalized)
  5. Number of people in ICU (on date admitted to ICU)
  6. Number of negative diagnoses (on date test performed)

There are various tools that already collate some of this, e.g. https://neherlab.org/covid19/ and https://coronavirus.jhu.edu/map.html. The task is to find the best available data sources and collate everything into a consistent format. Top priority is Africa and LMIC countries, but coverage should be as broad as possible.

Probabilities- 1.0 infection, symptomatic, severity, critical, death probabilities should not have recoveries

Found in regressing test cases.

In this simulation, every agent starts as infected, and the probabilities should put all of them in the fatal bucket:

    "n": 500,
    "n_infected": 500,
    "rel_crit_prob": 1.0,
    "rel_death_prob": 1.0,
    "rel_severe_prob": 1.0,
    "rel_symp_prob": 1.0,

The durations for each of those phases should be a constant 0 (a couple of examples)

    "crit2die": {
        "dist": "normal",
        "par1": 1,
        "par2": 0
    },
    "exp2inf": {
        "dist": "normal",
        "par1": 0,
        "par2": 0
    },

(I think I got all of them) but when the simulation runs, we only see 6 deaths and 494 recoveries.

Steps to repro:
  1. Convert the attached TEST_sim_from_results_json.txt to a .py file
  2. Copy that .py file and the attached txt file into a directory where covasim is installed
  3. Run the script and look at the console output

EXPECTED: 500 agents, 500 deaths
CURRENTLY: 500 agents, 6 deaths, 494 recoveries
TEST_sim_from_results_json.txt

DEBUG_test_disease_progression.DiseaseProgressionTests.txt
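
One possible reading, offered as an assumption rather than a diagnosis: the `rel_*_prob` parameters may be multipliers on baseline probabilities rather than absolute probabilities. The run_sim controller code quoted elsewhere on this page sets `rel_symp_prob` to `1.0/prog_pars.symp_prob`, which is what one would do to force an absolute probability of 1 under that interpretation. The sketch below shows the arithmetic; the baseline value is hypothetical.

```python
def absolute_prob(rel, baseline):
    """Absolute probability if `rel` is a multiplier on `baseline` (capped at 1)."""
    return min(1.0, rel * baseline)


def rel_for_target(target, baseline):
    """Multiplier needed for the absolute probability to equal `target`."""
    return target / baseline


baseline_death_prob = 0.25  # hypothetical value, for illustration only

unchanged = absolute_prob(1.0, baseline_death_prob)            # rel = 1.0 leaves the baseline as-is
needed = rel_for_target(1.0, baseline_death_prob)              # multiplier that would force certainty
forced = absolute_prob(needed, baseline_death_prob)
```

Under this reading, a relative value of 1.0 would leave the default prognoses untouched, which could be consistent with seeing only a small number of deaths (6 of 500 is 1.2%), though that has not been confirmed here.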

Make image smaller

Currently our image is multiple GB in size. That is way too much for this project.

Scrape best available demographics data

Currently, Covasim with the usepopdata = False option uses a hard-coded age distribution from the US:
https://github.com/institutefordiseasemodeling/covasim/blob/develop/covasim/people.py#L315

We need to make this adaptable to as many different locations as possible. The best source of data on this is the UN World Population Prospects:

https://population.un.org/wpp/

We need a simple way of being able to choose the population data (i.e., set the age_data variable) based on any country in the world, e.g. make_randpop(sim, location='ethiopia').

[UI] Improve frontend plotting

Related to #51, we need to think about how to fix plotting. Here's how plotting looks on the BE:
[image]

At minimum we need to plot data, but other changes would be good too. Options:

  • Keep Plotly and expand it
  • Move to Altair
  • Switch back to mpld3, so we can exactly reproduce backend figures

Upper limit of 56,800,235 people in the simulation

When generating a random population, Covasim uses sciris.uuid() to generate unique IDs for the population. Because of a safety limit in the fast_uuid() function in sciris, the simulation can only run for about 56 million people. By default it should be able to generate 56,800,235,584 UIDs.

This is the error:
ValueError: With a UID of type "ascii" and length 6, there are 56800235584 possible UIDs, and you requested 100000000, which exceeds the maximum allowed (56800235)
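
The numbers in the error message can be reproduced with a little arithmetic. The sketch below assumes a 62-character alphabet (ASCII letters plus digits, since 62**6 = 56,800,235,584) and a safety factor of 1000 (inferred from the ratio of the two numbers in the message). Both are assumptions about how sciris arrives at its limit, not a description of its implementation.

```python
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters (assumed alphabet)
SAFETY_FACTOR = 1000  # inferred from 56800235584 // 56800235; an assumption


def n_possible(length):
    """Number of distinct UIDs of the given length."""
    return len(ALPHABET) ** length


def max_allowed(length):
    """Largest request permitted under the assumed safety factor."""
    return n_possible(length) // SAFETY_FACTOR


def min_length(n_requested):
    """Shortest UID length whose allowed maximum covers the request."""
    length = 1
    while max_allowed(length) < n_requested:
        length += 1
    return length
```

Under these assumptions, a request for 100,000,000 UIDs would need a UID length of at least 7 rather than 6.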

Scrape workplace/industry data immediately

Got a request to look at reopening workplaces, perhaps by industry type or size. So, I could use a pair of hands or more to look at https://www.bls.gov/oes/current/oes_42660.htm and grab data from there on workplaces. Specifically data on age of workers, types of industry, workplace sizes (by industry too if available), at the finest granularity you can find (this page points at the Seattle-Tacoma-Bellevue metro area). Simple csvs that can be read in as pandas tables would be great. Please reach out if you can do this and are not working on modeling itself.

Move popdict to dataframe

Right now popdict is just that: a dict. By moving it to a DataFrame, we'll have access to better querying via pandas, or full parallelization via Dask later.

Enable full state runs with scaling

Covasim can run pretty well with populations up to ~100k. This is statistically enough to model even multi-million pops, assuming we can generate a sample similar to the overall population.

We can include functions that automatically scale results and iterations to show numbers for the real state population. So, create a 100k pop, calculate what fraction of the actual pop it represents (say, for a 5M state, it's going to be 0.02), and divide all numbers by that fraction (i.e., multiply by 50).

@cliff one open question I have with this approach is how to handle initial infections. Should we also scale them and round up? So for 10 initially infectious people in a 5M state, we'd run the sim with 1? That would skew results quite drastically, if I'm correct. Ideas?
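
The arithmetic can be sketched as follows, taking the sampled fraction as sim_pop / real_pop and scaling results up by its inverse. The function names and result keys are hypothetical.

```python
import math


def scale_factor(real_pop, sim_pop):
    """Number of real people each simulated agent stands for."""
    return real_pop / sim_pop


def scale_results(results, real_pop, sim_pop):
    """Scale simulated counts up to the real population size."""
    factor = scale_factor(real_pop, sim_pop)
    return {key: value * factor for key, value in results.items()}


def scaled_initial_infections(n_real, real_pop, sim_pop):
    """Initial infections in the simulated population, rounded up to whole agents."""
    return math.ceil(n_real * sim_pop / real_pop)
```

For a 5M state simulated with 100k agents, each agent stands for 50 people. Ten real initial infections scale to 0.2 agents, which rounds up to 1 agent, and that one agent then represents 50 real infections. This is the skew the question above is pointing at.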

[CI] Compare each run against baseline

After #26 is done, we can add a CI job that will compare results. It can generate its own plot, show both plots next to each other, and show the difference in totals.
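
The comparison step might look roughly like this, assuming both the baseline and the current run are stored as flat JSON objects of totals. The key names and numbers are invented for the example.

```python
import json


def compare_totals(baseline, current, rel_tol=0.0):
    """Return {key: (baseline, current, difference)} for totals that differ beyond rel_tol."""
    report = {}
    for key in sorted(set(baseline) | set(current)):
        old, new = baseline.get(key), current.get(key)
        if old is None or new is None:
            report[key] = (old, new, None)  # Key was added or removed
        elif abs(new - old) > rel_tol * abs(old):
            report[key] = (old, new, new - old)
    return report


# Invented totals, standing in for the saved baseline and the PR's run
baseline = json.loads('{"cum_infections": 1200, "cum_deaths": 14}')
current = json.loads('{"cum_infections": 1260, "cum_deaths": 14}')
```

A CI job could print the returned report in the PR and fail only when it is non-empty at the chosen tolerance.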

Refactor person.py:Person.infect method

Right now this method is a daisy chain of if statements that result in personalized probabilities of death etc. We could refactor it so that, instead of if statements, it does this as a probability calculation and allows dynamic addition of various modifiers, such as the availability of ventilators.
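
One shape such a refactor could take is a list of modifier functions applied in turn to a base probability. Everything in this sketch is hypothetical, including the person fields and the multiplier values.

```python
from typing import Callable, Dict, List

# A modifier takes (person, probability) and returns an adjusted probability
Modifier = Callable[[Dict, float], float]


def age_modifier(person, prob):
    """Raise the probability for older people (hypothetical multiplier)."""
    return prob * (2.0 if person["age"] >= 70 else 1.0)


def no_ventilator_modifier(person, prob):
    """Raise the probability when no ventilator is available (hypothetical multiplier)."""
    return prob * (1.5 if not person["ventilator_available"] else 1.0)


def death_prob(person, base_prob, modifiers: List[Modifier]):
    """Apply each modifier in order, then clamp the result to [0, 1]."""
    prob = base_prob
    for modify in modifiers:
        prob = modify(person, prob)
    return min(1.0, max(0.0, prob))
```

New constraints can then be added by appending another function to the list rather than by extending a chain of conditionals.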

Create CLI tool

We could add a CLI tool that would look like covasim --population popdict.pickle --parameters params.yaml --output /someoutputdir and would generate the dataset and plots.
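
A minimal argument-parsing sketch for the proposed command line, using `argparse`. Only the three flags named above come from the proposal; the defaults, help strings, and the decision to make the first two required are assumptions.

```python
import argparse


def build_parser():
    """Build the parser for the proposed covasim command."""
    parser = argparse.ArgumentParser(
        prog="covasim",
        description="Run a simulation and write the resulting dataset and plots.",
    )
    parser.add_argument("--population", required=True, help="Path to a pickled population dict")
    parser.add_argument("--parameters", required=True, help="Path to a YAML parameters file")
    parser.add_argument("--output", default=".", help="Directory for the generated dataset and plots")
    return parser


def main(argv=None):
    """Parse arguments; loading inputs and running the simulation is left out of this sketch."""
    args = build_parser().parse_args(argv)
    return args
```

Passing `argv` explicitly, as in `main(["--population", "popdict.pickle", "--parameters", "params.yaml"])`, keeps the entry point easy to unit test.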

Clean up controller with form model

Currently the logic for the run_sim endpoint is quite large. Let's move the params validation and cleaning into a form model.

    try:  # Outer try matching the final except; its opening line is not in the quoted excerpt
        orig_pars = cv.make_pars()
        defaults = get_defaults(merge=True)
        web_pars = {}
        web_pars['verbose'] = verbose # Control verbosity here

        for key,entry in {**sim_pars, **epi_pars}.items():
            print(key, entry)

            best = defaults[key]['best']
            minval = defaults[key]['min']
            maxval = defaults[key]['max']

            try:
                web_pars[key] = np.clip(float(entry['best']), minval, maxval)
            except Exception:
                user_key = entry['name']
                user_val = entry['best']
                err1 = f'Could not convert parameter "{user_key}", value "{user_val}"; using default value instead\n'
                print(err1)
                err += err1
                web_pars[key] = best
            if key in sim_pars: sim_pars[key]['best'] = web_pars[key]
            else: epi_pars[key]['best'] = web_pars[key]

        # Convert durations
        web_pars['dur'] = sc.dcp(orig_pars['dur']) # This is complicated, so just copy it
        web_pars['dur']['exp2inf']['par1'] = web_pars.pop('web_exp2inf')
        web_pars['dur']['inf2sym']['par1'] = web_pars.pop('web_inf2sym')
        web_pars['dur']['crit2die']['par1'] = web_pars.pop('web_timetodie')
        web_dur = web_pars.pop('web_dur')
        for key in ['asym2rec', 'mild2rec', 'sev2rec', 'crit2rec']:
            web_pars['dur'][key]['par1'] = web_dur

        # Add the intervention
        web_pars['interventions'] = []
        if web_pars['web_int_day'] is not None:
            web_pars['interventions'] = cv.change_beta(days=web_pars.pop('web_int_day'), changes=(1-web_pars.pop('web_int_eff')))

        # Handle CFR -- ignore symptoms and set to 1
        prog_pars = cv.get_default_prognoses(by_age=False)
        web_pars['rel_symp_prob'] = 1.0/prog_pars.symp_prob
        web_pars['rel_severe_prob'] = 1.0/prog_pars.severe_prob
        web_pars['rel_crit_prob'] = 1.0/prog_pars.crit_prob
        web_pars['rel_death_prob'] = web_pars.pop('web_cfr')/prog_pars.death_prob

    except Exception as E:
        err2 = f'Parameter conversion failed! {str(E)}\n'
        print(err2)
        err += err2

We can also add unit testing for the form model once it is created.
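
As a starting point for discussion, a form model could look something like the sketch below. It covers only the convert-and-clip step from the controller, and `ParamSpec`, `RunSimForm`, and the example parameter values are hypothetical names and numbers rather than existing Covasim code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ParamSpec:
    """Default value and allowed range for one web parameter."""
    best: float
    min: float
    max: float


@dataclass
class RunSimForm:
    """Validates raw web parameters against their defaults and allowed ranges."""
    defaults: Dict[str, ParamSpec]
    cleaned: Dict[str, float] = field(default_factory=dict)
    errors: List[str] = field(default_factory=list)

    def clean(self, raw: Dict[str, dict]) -> bool:
        """Convert and clip each entry; fall back to the default on failure."""
        for key, entry in raw.items():
            spec = self.defaults[key]
            user_val = entry.get("best")
            try:
                value = float(user_val)
                self.cleaned[key] = min(max(value, spec.min), spec.max)
            except (TypeError, ValueError):
                user_key = entry.get("name", key)
                self.errors.append(
                    f'Could not convert parameter "{user_key}", value "{user_val}"; using default value instead'
                )
                self.cleaned[key] = spec.best
        return not self.errors
```

The endpoint would then construct the form, call `clean(...)`, and read `cleaned` and `errors`, which also gives unit tests a small surface to target.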

Add a contributing doc

There are beginning to be a lot of scattered READMEs. I propose we put together a doc focused on getting contributors up and running. It should include at least the following:

  • Welcome to the project.
  • Links to important documents
    • Issues
    • Docs
    • Maybe a glossary of terms
  • Overview of types of contributions we are looking for.
  • Communicating in GitHub
  • Setup
    • Docker
    • Non-docker
  • Running tests
  • Reporting bugs
  • How to submit a PR
  • Making a feature request
  • Where can I get help?
  • Code of conduct

Some examples:
https://github.com/integrations/slack/blob/master/CONTRIBUTING.md
https://github.com/integrations/slack/blob/master/CODE_OF_CONDUCT.md

GitHub has some starter templates as well:
https://help.github.com/en/github/building-a-strong-community/creating-a-default-community-health-file
https://help.github.com/en/github/building-a-strong-community/adding-a-code-of-conduct-to-your-project

We may also want to create some templates to encourage proper formatting:
https://help.github.com/en/github/building-a-strong-community/configuring-issue-templates-for-your-repository
https://help.github.com/en/github/building-a-strong-community/creating-a-pull-request-template-for-your-repository

Add front end tooling

Currently we have a very simple front end with a single js file and a single css file. Do we want to add some level of testing, package management and build tooling?

Move checking bed_constraints to check_severe function

We should refactor the death calculation with bed constraints (and other medical constraints, like ventilators). Right now death is determined in the infect function, i.e. at the time of infection. Beds can run out between infection and the onset of severe symptoms. We should check the availability of beds when severe symptoms occur and, if none is available, modify the probability of death.

Also, rather than determining death probability in infect, we should re-evaluate every epoch.
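
To illustrate the timing change only, here is a small sketch in which bed availability is checked when severe symptoms begin. `BedPool`, the function name, and the multiplier applied when no bed is free are all hypothetical.

```python
class BedPool:
    """Tracks how many hospital beds are currently free."""

    def __init__(self, n_beds):
        self.free = n_beds

    def try_admit(self):
        """Take a bed if one is free and report whether admission succeeded."""
        if self.free > 0:
            self.free -= 1
            return True
        return False


def death_prob_at_severe_onset(base_prob, beds, no_bed_multiplier=2.0):
    """Evaluate the death probability when severe symptoms begin, not at infection time."""
    admitted = beds.try_admit()
    prob = base_prob if admitted else base_prob * no_bed_multiplier
    return min(1.0, prob)
```

A fuller version would also release beds on recovery or death, and could re-evaluate the probability on every time step as suggested above.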

[CI] Create baseline with actions

Some tests will produce actual simulation results.

Every time a PR is merged, we can run these, save the results as JSON, and generate a plot. This can be committed to the repo via Actions. This will produce a baseline that allows us to compare PR runs against it and generate a report to help reviewers.

[CI] Uncomment /tests/unittests

Currently we don't run these because of how strict they are about changes in simulation results. Because this is a stochastic model, there is a lot of uncertainty about what the results will actually look like at the end (within reason).

We should refactor the unittests so that they always succeed unless someone makes a meaningful change to the results (one example would be: if mortality is 0, the number of deaths should always be 0).

After that is done, we should uncomment these tests for CI.
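
A sketch of what such invariant-style tests might look like. `toy_sim` is a made-up stand-in for a stochastic simulation so that the example is self-contained; the real tests would call Covasim instead.

```python
import random
import unittest


def toy_sim(n_people, death_prob, seed):
    """Stand-in for a stochastic simulation; returns (deaths, recoveries)."""
    rng = random.Random(seed)
    deaths = sum(rng.random() < death_prob for _ in range(n_people))
    return deaths, n_people - deaths


class InvariantTests(unittest.TestCase):
    """Checks that should hold for every seed, whatever the exact numbers are."""

    def test_zero_mortality_means_zero_deaths(self):
        for seed in range(20):
            deaths, _ = toy_sim(500, death_prob=0.0, seed=seed)
            self.assertEqual(deaths, 0)

    def test_outcomes_account_for_everyone(self):
        for seed in range(20):
            deaths, recoveries = toy_sim(500, death_prob=0.3, seed=seed)
            self.assertEqual(deaths + recoveries, 500)
```

Tests of this kind assert properties rather than exact totals, so they should only fail when a change alters the model's behavior in a meaningful way.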
