scenarios's Introduction

scenarios: Compare epidemic scenarios

License: MIT. Project Status: Suspended – Initial development has started, but there has not yet been a stable, usable release; work has been stopped for the time being but the author(s) intend on resuming work.

scenarios was intended to provide functions to compare the outcomes of epidemic modelling simulations.

The development of scenarios has been suspended in favour of increased scenario modelling and comparison functionality coming to the epidemics package. Development may resume once a use case for a separate comparison package is clearer.

Installation

You can install the development version of scenarios from GitHub with:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("epiverse-trace/scenarios")

Quick start

The examples below show the existing functionality; this is not currently planned to be developed further.

An example with finalsize

Define an epidemic model scenario by creating a new scenario object. The standard workflow needs a model function, such as finalsize::final_size(), and appropriate arguments to the function.

# load scenarios
library(scenarios)

# create a scenario for pandemic-potential influenza
# with finalsize::final_size() as the model function
scenario_pandemic_flu <- scenario(
  model_function = "finalsize::final_size",
  parameters = make_parameters_finalsize_UK(), # a helper function
  replicates = 3
)

View a summary of the scenario.

scenario_pandemic_flu
#> Epidemic scenario object
#>  Scenario name: No name specified (NA)
#>  Model function: finalsize::final_size
#>  Extra information on: 
#>  Scenario replicates: 3
#>  Scenario outcomes are not prepared

The scenario object is created but the model function is not initially run. This can be checked using a helper function.

Tip: Many helper functions have the prefix sce_ to help them be found quickly when using autocomplete in various text editors and IDEs.

# check whether the scenario data have been generated
sce_has_data(scenario_pandemic_flu)
#> [1] FALSE

Model outcome data can be generated by running the model function with the parameters specified in the scenario.

scenario_pandemic_flu <- run_scenario(scenario_pandemic_flu)

Take a peek at the column names in the model outcome replicates.

sce_peek_outcomes(scenario_pandemic_flu)
#>       demo_grp       susc_grp susceptibility     p_infected      replicate 
#>    "character"    "character"      "numeric"      "numeric"      "integer"

Get the outcomes from all replicates as a single dataset, or aggregate an outcome variable of interest across replicates by some grouping variable.

# get all output
head(sce_get_outcomes(scenario_pandemic_flu))
#>    demo_grp   susc_grp susceptibility p_infected replicate
#>      <char>     <char>          <num>      <num>     <int>
#> 1:   [0,20) susc_grp_1              1  0.6544866         1
#> 2:  [20,40) susc_grp_1              1  0.5750030         1
#> 3:      40+ susc_grp_1              1  0.4588871         1
#> 4:   [0,20) susc_grp_1              1  0.6544866         2
#> 5:  [20,40) susc_grp_1              1  0.5750030         2
#> 6:      40+ susc_grp_1              1  0.4588871         2

# aggregate proportion infected by demographic group
# NOTE that all replicates have the same outcome in this deterministic model
sce_aggregate_outcomes(
  x = scenario_pandemic_flu,
  grouping_variables = c("demo_grp"),
  measure_variables = c("p_infected"),
  summary_functions = c("mean", "min", "max")
)
#> Key: <demo_grp>
#>    demo_grp p_infected_mean p_infected_min p_infected_max
#>      <char>           <num>          <num>          <num>
#> 1:      40+       0.4588871      0.4588871      0.4588871
#> 2:   [0,20)       0.6544866      0.6544866      0.6544866
#> 3:  [20,40)       0.5750030      0.5750030      0.5750030

An example with epidemics

This example shows the same workflow applied to a simple, deterministic epidemic model from the epidemics package.

# create a new scenario
scenario_sir <- scenario(
  model_function = "epidemics::sir_desolve",
  parameters = make_parameters_SIR_epidemic(), # a helper function
  replicates = 5L
)

# view the initial conditions and infection parameters
sce_get_information(scenario_sir, which = c("init", "parms"))
#> $init
#>    S    I    R 
#> 0.99 0.01 0.00 
#> 
#> $parms
#>  beta gamma 
#>   1.0   0.1

# generate scenario outcomes by running the model
scenario_sir <- run_scenario(scenario_sir)

# peek at the outcomes
sce_peek_outcomes(scenario_sir)
#>       time      state proportion  replicate 
#>  "numeric"   "factor"  "numeric"  "integer"

# view the aggregated outcomes
# this is the per-timestep, per-class (S, I, R) mean proportion
# and is the same across replicates in this deterministic model
tail(
  sce_aggregate_outcomes(
    scenario_sir,
    grouping_variables = c("time", "state"),
    measure_variables = "proportion",
    summary_functions = "mean"
  )
)
#> Key: <time, state>
#>     time  state proportion_mean
#>    <num> <fctr>           <num>
#> 1:    99      S    4.501519e-05
#> 2:    99      I    8.643038e-05
#> 3:    99      R    9.998686e-01
#> 4:   100      S    4.501149e-05
#> 5:   100      I    7.820928e-05
#> 6:   100      R    9.998768e-01

Help

To report a bug please open an issue; please note that development on scenarios has been suspended.

Contribute

Development on scenarios has been suspended.

However, use cases or requirements for a package that helps compare outcomes of epidemic scenario models are very welcome as issues, or on the main Epiverse Discussion board.

Please follow the package contributing guide.

Code of conduct

Please note that the scenarios project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

scenarios's People

Contributors

actions-user, bisaloo, pratikunterwegs

scenarios's Issues

Function to add information to scenarios

This issue is to request a function, potentially called sce_add_info(), which allows more information to be added to the extra_info list in a scenario class.
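A minimal sketch of how such a function might behave, assuming a scenario is a list with an extra_info element. The toy_scenario() constructor, the argument names, and the refusal to overwrite existing entries are all hypothetical choices for illustration, not the package's API.

```r
# hypothetical stand-in for a scenario object: a list with an extra_info field
toy_scenario <- function(extra_info = list()) {
  structure(list(extra_info = extra_info), class = "toy_scenario")
}

# sketch of sce_add_info(): append a named list to extra_info
sce_add_info <- function(x, info) {
  stopifnot(
    inherits(x, "toy_scenario"),
    is.list(info),
    !is.null(names(info)),
    all(nzchar(names(info)))
  )
  # refuse to silently overwrite information that is already present
  clashes <- intersect(names(info), names(x$extra_info))
  if (length(clashes) > 0) {
    stop("Already present in extra_info: ", paste(clashes, collapse = ", "))
  }
  x$extra_info <- c(x$extra_info, info)
  x
}

x <- toy_scenario(list(pathogen = "influenza"))
x <- sce_add_info(x, list(country = "UK"))
names(x$extra_info)
```

Returning the modified object, rather than modifying in place, keeps the function consistent with the run_scenario() pattern shown in the README.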

Option for time-stamp in scenario object (i.e. versioning as metadata)

This package is going to be very useful for managing prospective scenario modelling (i.e. 'what if?' questions about future dynamics), especially as there can be a lot of iteration involved in analysis (e.g. early COVID scenarios for UK and subsequent variant and roadmap scenarios).

Had a couple of questions from a design perspective. Historically, scenario assumptions are often recorded in ad-hoc wrappers, and versions tracked via commits or ad-hoc versioning (if at all). So as this package takes shape, wondered if useful to have this as metadata (perhaps with a stamp in the object as simple first pass, if it's structured this way)?
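As a rough illustration of the "stamp in the object" idea, the sketch below records a creation time and a free-text version label when a scenario specification is created. The stamped_scenario() constructor and the metadata field names are hypothetical, not part of the package.

```r
# hypothetical constructor that stamps a scenario specification on creation
stamped_scenario <- function(model_function, parameters,
                             version = NA_character_) {
  structure(
    list(
      model_function = model_function,
      parameters = parameters,
      metadata = list(
        created = Sys.time(), # POSIXct time stamp taken at creation
        version = version     # free-text version label chosen by the user
      )
    ),
    class = "stamped_scenario"
  )
}

s <- stamped_scenario(
  "finalsize::final_size",
  list(r0 = 1.5),
  version = "v1"
)
format(s$metadata$created, "%Y-%m-%d %H:%M:%S")
s$metadata$version
```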

There's also the related issue of scenario versions passed between models, e.g. the branching process transmission step in ringbp (used for early COVID contact tracing analysis) was later used to form the transmission process in the covidhm model (used for community network analysis) with subsequent version used in event outbreak analysis. Feels like this package would be nice way to pass shared modularised assumptions between models (if feasible)?

Design of the `comparisons` class

This issue is intended for design discussions relating to the comparisons class.

The comparisons class is intended to be an S3 class object that holds comparisons between epidemic scenario modelling outputs, which are represented as scenario objects. The idea is for this object to ingest several scenario objects as a list, set a user-specified scenario as a ‘baseline’ or ‘reference’, check which other scenarios are comparable (based on user-specified fields and model type; see General helper functions), and get the difference in a user-specified outcome between the baseline and all other comparable scenarios.

An example is a comparison of multiple final_size() scenarios with varying susceptibility matrices reflecting different assumptions about population immunity to an infection (due to vaccination or prior exposure) but sharing the same underlying population demographic structure and contact patterns, and sharing infection parameters.

Attributes

  • Data: A list of scenario class objects representing epidemic simulation outcomes (see above). This list should ideally be informatively named. scenario objects can exist as function-and-parameter specifications without the simulation being run (see above).
  • Baseline outcome tag: Which object in the data list is the baseline outcome. Can be a string or integer, with strings preferred.
  • Scenario matching tags: Named list that specifies which attributes are used to check whether scenarios are comparable. Should have at least one element, the function used to run the epidemic simulation, to avoid comparing outputs from different packages.
  • Scenario comparison tags: Named list that specifies which outcomes should be compared, e.g. the proportion of individuals infected in a final_size() run.
  • Comparison grouping tags: Named list that specifies which groups should be used when making comparisons.
    The idea here is to make it possible to compare both (1) scenarios with different susceptibility groups at a relatively broad level (e.g. a final_size() scenario with a fully susceptible population, against one with multiple [reduced] susceptibility groups; in this case, the outcomes of the latter scenario should be pooled across groups), but also (2) scenarios with similar groups, at the group level.
  • Comparison summary: A summary of the comparison, obtained after setting a baseline and comparing scenarios.

Functions

  • Constructor: Initialise a new comparisons object, with a list of scenario objects. Optionally, specify a baseline. 
  • Print (and Summary): Print the contents of the object to screen. Prints a summary of the most important details. If a baseline is set, prints the identity and a summary of the baseline scenario. If the comparison summary has been generated, prints the comparison summary as well; if not, a warning that the comparison has not yet been calculated.
  • Make comparisons: Match comparable scenarios and compare against the baseline. Comparisons could be at the level of summary statistics. This function should be able to check whether the scenarios data list has scenario objects with the data available, or whether the scenario data itself needs to be generated by calling the scenario function (see above). If the latter, the function should generate the scenario data before making the comparison. The output of this function is one or more data structures (lists or data frames, or a named list and a data.frame) with differences between each scenario and the baseline.
    For discussion: How to correctly propagate uncertainty in epidemic simulation outcomes when getting the difference between each scenario and the baseline. Options include a simple difference of central tendency, a comparison of the variance in outcomes, or a measure of the overlap between the 95% CIs of any pair of scenarios (where one is the baseline). 
  • Access functions: Simple functions to access the scenario data or comparisons data, and/or other object attributes.
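The design above can be sketched in base R as follows. This is only an illustration under stated assumptions: toy_scenario(), toy_comparison(), and make_comparison() are hypothetical names, scenarios are plain lists that already hold outcome data, matching is done on the model function alone, and the comparison is a simple difference of means.

```r
# hypothetical scenario stand-in: a model function name plus outcome data
toy_scenario <- function(model_function, p_infected) {
  list(
    model_function = model_function,
    data = data.frame(
      demo_grp = c("[0,20)", "[20,40)", "40+"),
      p_infected = p_infected
    )
  )
}

# sketch of a constructor: a named list of scenarios plus a baseline tag
toy_comparison <- function(data, baseline) {
  stopifnot(is.list(data), !is.null(names(data)), baseline %in% names(data))
  structure(
    list(data = data, baseline = baseline, summary = NULL),
    class = "toy_comparison"
  )
}

# difference in mean outcome between each comparable scenario and the baseline
make_comparison <- function(x, outcome = "p_infected") {
  base <- x$data[[x$baseline]]
  others <- x$data[setdiff(names(x$data), x$baseline)]
  # only compare scenarios run with the same model function as the baseline
  same_fn <- vapply(
    others,
    function(s) identical(s$model_function, base$model_function),
    logical(1)
  )
  others <- others[same_fn]
  diffs <- vapply(
    others,
    function(s) mean(s$data[[outcome]]) - mean(base$data[[outcome]]),
    numeric(1)
  )
  x$summary <- data.frame(scenario = names(others), difference = unname(diffs))
  x
}

cmp <- toy_comparison(
  list(
    full_susc = toy_scenario("finalsize::final_size", c(0.65, 0.58, 0.46)),
    vaccinated = toy_scenario("finalsize::final_size", c(0.45, 0.38, 0.26)),
    sir = toy_scenario("epidemics::sir_desolve", c(0.9, 0.9, 0.9))
  ),
  baseline = "full_susc"
)
cmp <- make_comparison(cmp)
cmp$summary
```

The scenario run with a different model function is left out of the summary, which mirrors the intent of the scenario matching tags. A difference of means ignores the uncertainty question raised above.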

Design of the `scenario` class

This issue is an ongoing discussion about the design of the {scenario} class.

A scenario is intended to be an S3 class object that holds the specifications of an epidemic simulation run, i.e., the function and its arguments (simulation parameters), and optionally, the data obtained from executing these simulation runs. S3 is chosen over S4 and R6 because it is somewhat better documented and more widely used, and, unlike R6, it does not require a dependency.

Attributes

  • Simulation function: Which function returns the scenario output, e.g. final_size() from finalsize.
  • Parameters: A list of simulation parameters stored as a named list, e.g. the parameters passed to a single run of finalsize::final_size().
  • Replicates: The number of replicates of a simulation to run, using the function specified as the simulation function, with parameters specified above. Specifying more than one replicate only makes sense for stochastic simulations where epidemic outputs vary in each replicate.
    For analytical models such as final_size(), which usually converge on the same value in each run, there will need to be a way to specify that one or more simulation parameters should be drawn from distributions (e.g. R0 or a social contact matrix); alternatively, each draw of a parameter from a distribution could be a single replicate of a unique scenario. 
  • Data availability tag: A boolean tag (TRUE/FALSE) that indicates whether the scenario object has any simulation output data. The idea is to allow scenarios to exist as simulation run specifications without data (i.e., an intent to run N replicates of this epidemic simulation with these parameters). This avoids using memory and processing time until required, such as at the comparison stage. This tag should be updated after any epidemic simulations are run (see Methods/Functions). The idea is for all replicates to be run simultaneously, and this could be parallelised to improve speed.
    For discussion: Whether it should be possible to remove data after extracting summary statistics, to save working memory. 
  • Data list: A named list of epidemic scenario outputs, e.g., a list of outputs from final_size(), or from future epidemics functions [placeholder name epi_demic()]. List names are added to make list indexing easier to understand; i.e., it is easier to see what a function is doing when it selects an object as data[["finalsize_UK_full_susceptibility"]], rather than data[[1]].
  • Summary statistics: A named list of summary statistics on the epidemic simulation outcomes, e.g., the mean and 95% CI of the final sizes of an epidemic by age group.
    For discussion: Which summary statistics to return, and how to represent them. Note: See also how epiparameter represents delay distributions (drawing on implementation in distributional). 

Methods/functions

  • Constructor: Initialise a new epidemic scenario, with a function name and parameter list. Data are not initially prepared. Alternatively, convert a list of data objects, a parameter list, and a function name into a scenario object whose data availability tag is set to TRUE.
  • Print (and Summary): Print a representation of the scenario object to screen. This should include important details including the function that was used to run the epidemic simulation, the parameter list (truncated for readability if necessary), the number of replicates, and the data availability tag.
  • Run scenario: A function to populate the scenario data list with output from N replicates of the specified function, using the parameter list. Calls e.g. final_size() or in future, epi_demic().
  • Summarise scenario: A function to get summary statistics from the N simulation outputs and populate the summary statistics field.
  • Access functions: Functions to access class elements, allowing users to avoid accessing them directly (e.g. using scenario$…).
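A base R sketch of the attributes and methods listed above, under stated assumptions: the names toy_scenario(), toy_has_data(), and toy_run() are hypothetical, the model function is passed as a function rather than by name, and the data availability tag is derived from the data list instead of being stored separately.

```r
# sketch of a constructor: a specification without any data
toy_scenario <- function(model_function, parameters, replicates = 1L) {
  stopifnot(
    is.function(model_function),
    is.list(parameters),
    replicates >= 1
  )
  structure(
    list(
      model_function = model_function,
      parameters = parameters,
      replicates = as.integer(replicates),
      data = list()
    ),
    class = "toy_scenario"
  )
}

# data availability tag, derived from the data list rather than stored
toy_has_data <- function(x) length(x$data) > 0

# run the model function once per replicate and store the outputs
toy_run <- function(x) {
  x$data <- lapply(seq_len(x$replicates), function(i) {
    do.call(x$model_function, x$parameters)
  })
  x
}

# print method showing the main details of the specification
print.toy_scenario <- function(x, ...) {
  cat("Toy scenario\n")
  cat(" Replicates:", x$replicates, "\n")
  cat(" Has data:", toy_has_data(x), "\n")
  invisible(x)
}

# placeholder stochastic model: proportion infected out of n individuals
toy_model <- function(n, p) rbinom(1, n, p) / n

set.seed(1)
s <- toy_scenario(toy_model, list(n = 1000, p = 0.3), replicates = 3)
toy_has_data(s) # FALSE before the scenario is run
s <- toy_run(s)
toy_has_data(s) # TRUE after the scenario is run
```

Deriving the tag from the data list means it cannot fall out of sync with the data, at the cost of not being able to distinguish "never run" from "run, then dropped".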

Add name slot to scenario class

This issue is to request that the scenario class should have an optional 'scenario_name' slot, and that the constructor should have an option to pass the name. This will require modifying how comparison class objects look for the baseline scenario among the scenarios they contain, as well as other helper functions (such as sce_get_outcomes()) that require scenario names. There will also have to be considerations of how to handle scenarios that are not named, when returning their data.

Mark scenarios development temporarily suspended

This issue is to mark {scenarios} repo status as "suspended".

Context:

  • {scenarios} was initially expected to help users run and compare outputs from epidemic scenario models, such as from the upcoming package {epidemics};
  • {epidemics} development is swiftly moving in a direction where such comparison functionality may be included in the package itself;
  • The use case for {scenarios} is now not as strong;
  • Further development on {epidemics}, user feedback, or identification of specific use cases may lead to a resumption of {scenarios} development.

Related PR to remove the hex logo from the main website: epiverse-trace/epiverse-trace.github.io#187

Design of helper functions

This issue is intended as a design discussion for helper functions in {scenarios}.

These functions are intended to be used within comparisons objects, but could also be made available to users for more ad hoc use. 

  • are_comparable(): Function that takes two or more scenario class objects as input and checks which pair-wise comparisons are possible. This depends on whether the same function was called, and whether the parameter list has the same elements (throw warnings if the parameter list is identical).
    It should also check whether some user-specified parameters are identical among scenarios (e.g. if comparing different R0 in the same population, do not allow comparison where demography is different, such as when demographic groups have different age limits). It could be used for only two objects, in which case it should return a single boolean; otherwise, it should return a boolean matrix of whether each pair of scenarios is comparable.
  • select_comparable(): Function that takes two or more scenario class objects as input, as well as an optional string or integer specifying which should be considered the ‘baseline’ object. Runs are_comparable() and returns only those scenarios which are comparable as a list; the ‘baseline’ object is returned as the first element of the list. If no ‘baseline’ is specified, the first object is assumed to be the baseline with a warning or message to the user. 
  • make_scenario_names(): Function to make informative names for scenario objects, based on their parameter combinations, and/or other user-specified name components. 
  • read_scenario() / write_scenario(): Functions to read in and save a scenario specification from or to a file (structured Excel, JSON, YAML). 
  • More helper functions as required.
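The first two helpers could be sketched along these lines. The toy_scenario() stand-in, the match_variables argument, and the restriction to a named list with a string baseline are assumptions made for illustration.

```r
# hypothetical scenario stand-in: a model function name and a parameter list
toy_scenario <- function(model_function, parameters) {
  list(model_function = model_function, parameters = parameters)
}

# two scenarios are comparable if they use the same model function
# and agree on the user-specified parameters named in match_variables
pair_comparable <- function(a, b, match_variables) {
  identical(a$model_function, b$model_function) &&
    identical(a$parameters[match_variables], b$parameters[match_variables])
}

# sketch of are_comparable(): one boolean for two scenarios, else a matrix
are_comparable <- function(scenarios, match_variables) {
  n <- length(scenarios)
  nm <- names(scenarios)
  out <- matrix(TRUE, n, n, dimnames = list(nm, nm))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      out[i, j] <- pair_comparable(
        scenarios[[i]], scenarios[[j]], match_variables
      )
    }
  }
  if (n == 2L) {
    return(out[1, 2])
  }
  out
}

# sketch of select_comparable(): baseline first, then comparable scenarios
select_comparable <- function(scenarios, match_variables, baseline = NULL) {
  if (is.null(baseline)) {
    baseline <- names(scenarios)[1]
    message("No baseline specified; using '", baseline, "'")
  }
  ok <- vapply(
    scenarios, pair_comparable, logical(1),
    b = scenarios[[baseline]], match_variables = match_variables
  )
  scenarios[c(baseline, setdiff(names(scenarios)[ok], baseline))]
}

ages <- c(0, 20, 40)
scenarios <- list(
  low_r0 = toy_scenario(
    "finalsize::final_size", list(r0 = 1.3, age_limits = ages)
  ),
  high_r0 = toy_scenario(
    "finalsize::final_size", list(r0 = 2.0, age_limits = ages)
  ),
  other_ages = toy_scenario(
    "finalsize::final_size", list(r0 = 2.0, age_limits = c(0, 65))
  )
)
comparable <- are_comparable(scenarios, match_variables = "age_limits")
selected <- select_comparable(scenarios, "age_limits", baseline = "high_r0")
names(selected)
```

Here the scenario with different age limits is excluded, while the two scenarios that differ only in R0 remain comparable.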

Structured Excel import-export for scenario objects

To share or reproduce scenarios, it is important to be able to save scenario objects in a widely used format. MS Excel is a widely used tool for tabular data and some modelling, and it should be possible to read and write scenario specifications and data from structured Excel sheets. This feature request is to add functions that allow reading and writing scenarios to and from Excel documents.

Scenario should inherit from data.table or data.frame

This issue is to suggest that scenario objects should inherit from data.table or data.frame. This would have benefits, including:

  1. Allow the use of methods for data.frames and data.tables, allowing users to easily build on top of scenarios, or to join and manipulate scenarios in ways that they are already accustomed to doing in their workflows with tabular data;
  2. Put the data aspect of the scenario front and centre so that users know that scenario objects are essentially model outputs (in the form of data.frames), with some attached parameters.
    Other benefits could be added to comments on this issue.
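One way this suggestion might look in base R is a data.frame subclass that carries the specification as attributes. The as_scenario_df() constructor and the attribute names are hypothetical.

```r
# sketch: a scenario as a data.frame subclass, with the specification
# (model function name and parameters) stored as attributes
as_scenario_df <- function(data, model_function, parameters) {
  stopifnot(is.data.frame(data))
  structure(
    data,
    model_function = model_function,
    parameters = parameters,
    class = c("scenario_df", class(data))
  )
}

outcomes <- data.frame(
  demo_grp = c("[0,20)", "[20,40)", "40+"),
  p_infected = c(0.65, 0.58, 0.46)
)
s <- as_scenario_df(outcomes, "finalsize::final_size", list(r0 = 1.5))

# existing data.frame tools work on the object directly
nrow(s)
mean(s$p_infected)
attr(s, "model_function")
```

A real implementation would need to check which data.frame or data.table operations preserve the custom attributes, since not every operation is guaranteed to keep them.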

Add a 'Get started' vignette

This issue is to request the addition of a basic vignette that showcases the creation of scenario objects, accessing parameters, running scenarios and getting data, checking for comparability, and finally combining scenarios into a comparison object.

This basic vignette should also have text that explains the use cases for these operations with realistic examples.

Store scenario seed

This issue is to discuss whether, and potentially to request that, scenario objects should store the seed used to run epidemic simulations. This will help the reproducibility of stochastic models.
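A small sketch of the idea using base R's set.seed(). The run_with_seed() function and the toy model are hypothetical; the point is only that storing the seed alongside the data lets the replicates be regenerated exactly.

```r
# hypothetical runner that records the seed used for the replicates
run_with_seed <- function(model_function, parameters, replicates, seed) {
  set.seed(seed)
  data <- lapply(seq_len(replicates), function(i) {
    do.call(model_function, parameters)
  })
  list(data = data, seed = seed)
}

# placeholder stochastic model: number infected out of n individuals
toy_model <- function(n, p) rbinom(1, n, p)

first <- run_with_seed(
  toy_model, list(n = 100, p = 0.3),
  replicates = 5, seed = 42
)

# re-running with the stored seed reproduces the same outcomes
second <- run_with_seed(
  toy_model, list(n = 100, p = 0.3),
  replicates = 5, seed = first$seed
)
identical(first$data, second$data)
```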

Function to filter comparable scenarios in a comparison

This issue is to request a function, provisionally called sce_filter_comparable, that filters the scenario objects in a comparison object based on whether they can be compared with the baseline object of the comparison.

This function would rely on the sce_are_comparable function for pair-wise comparisons of each scenario against the baseline, while passing the arguments match_variables, comparison_variables, and expect_identical_match.

The function should return a comparison object with the baseline and any other comparable scenarios, and print a warning message when there are no comparable scenarios.

JSON import-export for scenario objects

To share or reproduce scenarios, it is important to be able to save scenario objects in a lightweight and widely usable format such as JSON. This feature request is to add functions that allow saving a scenario to a JSON file, and creating a scenario from a JSON file, with or without model data already generated.

YAML import-export for scenario objects

To share or reproduce scenarios, it is important to be able to save scenario objects in a lightweight and widely usable format such as YAML. This feature request is to add functions that allow saving a scenario to a YAML file, and creating a scenario from a YAML file, with or without model data already generated.

Compact documentation

This issue is to request that the documentation of related functions in {scenarios} be compacted into a single help page, using @rdname or @inheritParams where suitable. An example is the pair sce_get_baseline() and sce_set_baseline().
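As an illustration of the roxygen2 pattern, the sketch below documents a getter and setter pair on one shared help page using @name and @rdname. The functions toy_get_baseline() and toy_set_baseline(), their arguments, and their bodies are hypothetical placeholders, not the package's actual implementations.

```r
#' Get or set the baseline scenario of a comparison
#'
#' @param x A comparison object (here, any list with a `baseline` element).
#' @param value Name of the scenario to use as the baseline.
#' @return `toy_get_baseline()` returns the baseline name;
#'   `toy_set_baseline()` returns the modified object.
#' @name baseline
NULL

#' @rdname baseline
toy_get_baseline <- function(x) {
  x$baseline
}

#' @rdname baseline
toy_set_baseline <- function(x, value) {
  x$baseline <- value
  x
}

cmp <- toy_set_baseline(list(), "pandemic_flu")
toy_get_baseline(cmp)
```

With this layout both functions share a single help topic, and the parameters are documented once.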

Function to drop scenario data

This issue is to add a function, potentially called sce_drop_data(), that allows the data field of a scenario to be dropped safely without accessing the class members.
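A minimal sketch of such a function, assuming the scenario stores its outputs in a list element called data. The toy_scenario() constructor is a hypothetical stand-in for the real class.

```r
# hypothetical stand-in for a scenario object that holds outcome data
toy_scenario <- function(data = list()) {
  structure(list(data = data), class = "toy_scenario")
}

# sketch of sce_drop_data(): empty the data field but keep the field itself
sce_drop_data <- function(x) {
  stopifnot(inherits(x, "toy_scenario"))
  x$data <- list()
  x
}

s <- toy_scenario(list(data.frame(p_infected = 0.5)))
s <- sce_drop_data(s)
length(s$data)
```

Assigning an empty list rather than NULL keeps the data element in place, so code that checks its length continues to work.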
