
gettbinr's Introduction

getTBinR: Access and Summarise World Health Organization Tuberculosis Data


Quickly and easily import analysis-ready tuberculosis (TB) burden data from the World Health Organization (WHO) into R. The aim of getTBinR is to allow researchers, and other interested individuals, to gain access to a detailed TB data set and start using it to derive key insights. It provides a consistent set of tools for rapidly evaluating hypotheses on a widely used data set before they are explored further using more complex methods or more detailed data. These tools include: generic plotting and mapping functions; a data dictionary search tool; an interactive shiny dashboard; and an automated, country-level TB report. For newer R users, the package lowers the barrier to entry by handling data import, munging, and visualisation. All plotting and mapping functions are built with ggplot2, so they can be readily extended. See the WHO data permissions for terms of use. For help getting started see the Getting Started vignette, and for a case study using the package see the Exploring Global Trends in Tuberculosis Incidence Rates vignette.

Installation

Install the CRAN version:

install.packages("getTBinR")

Alternatively install the development version from GitHub:

# install.packages("devtools")
devtools::install_github("seabbs/getTBinR")

Quick start

Let's get started quickly by mapping and then plotting TB incidence rates in the United Kingdom. First map the most recently available global TB incidence rates (if they are not found locally, this also downloads the TB burden data and its data dictionary and saves them to R's temporary directory),

getTBinR::map_tb_burden(metric = "e_inc_100k")

Then compare TB incidence rates in the UK to TB incidence rates in other countries in the region,

getTBinR::plot_tb_burden_overview(metric = "e_inc_100k",
                                  countries = "United Kingdom",
                                  compare_to_region = TRUE)

In order to compare the changes in incidence rates over time, in the region, plot the annual percentage change,

getTBinR::plot_tb_burden_overview(metric = "e_inc_100k",
                                  countries = "United Kingdom",
                                  compare_to_region = TRUE,
                                  annual_change = TRUE)
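
As a rough illustration of what an annual percentage change means, the sketch below computes the year-on-year relative change for a short series in base R. The helper `annual_pct_change` is hypothetical and not part of getTBinR, and the definition used here (change relative to the previous year) is an assumption about how `annual_change = TRUE` behaves. The values are the first UK rows of the `summarise_tb_burden()` output shown later in this README.

```r
# First five UK values from the summarise_tb_burden() output in this README.
uk <- data.frame(
  year = 2000:2004,
  rate = c(11.9, 11.5, 13.1, 13.4, 13.2)
)

# Hypothetical helper (not part of getTBinR): percentage change relative to
# the previous year. The first year has no predecessor, so it is NA.
annual_pct_change <- function(x) {
  c(NA_real_, diff(x) / head(x, -1)) * 100
}

uk$pct_change <- round(annual_pct_change(uk$rate), 1)
uk
```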

Now plot TB incidence rates over time in the United Kingdom, compared to TB incidence rates in Europe and globally.

getTBinR::plot_tb_burden_summary(metric = "e_inc_num",
                                 metric_label = "e_inc_100k",
                                 countries = "United Kingdom",
                                 compare_all_regions = FALSE,
                                 compare_to_region = TRUE,
                                 compare_to_world = TRUE)

We can repeat the above plot but this time only for the UK - this allows us to get a clear picture of trends in TB incidence rates in the UK.

getTBinR::plot_tb_burden(metric = "e_inc_100k",
                         countries = "United Kingdom")

We might be interested in having some of this information in tabular form. We can either generate a short summary for the most recent year of available data with the following,

getTBinR::summarise_metric(metric = "e_inc_100k",
                           countries = "United Kingdom")
#> # A tibble: 1 x 6
#>   country         year metric        world_rank region_rank avg_change
#>   <chr>          <int> <chr>              <int>       <int> <chr>     
#> 1 United Kingdom  2018 8 (7.2 - 8.8)        165          33 -5.9%

Or a more detailed dataset as follows,

getTBinR::summarise_tb_burden(metric = "e_inc_num",
                              stat = "rate",
                              countries = "United Kingdom", 
                              compare_to_world = FALSE, 
                              compare_to_region = FALSE) 
#> # A tibble: 133 x 5
#>    area            year e_inc_num e_inc_num_lo e_inc_num_hi
#>    <fct>          <int>     <dbl>        <dbl>        <dbl>
#>  1 United Kingdom  2000      11.9         10.7         13.1
#>  2 United Kingdom  2001      11.5         10.3         12.7
#>  3 United Kingdom  2002      13.1         11.8         14.3
#>  4 United Kingdom  2003      13.4         12.1         14.8
#>  5 United Kingdom  2004      13.2         11.9         14.5
#>  6 United Kingdom  2005      15.3         13.8         16.6
#>  7 United Kingdom  2006      15.3         13.8         16.4
#>  8 United Kingdom  2007      14.6         13.2         16.1
#>  9 United Kingdom  2008      15.0         13.5         16.1
#> 10 United Kingdom  2009      14.5         13.1         15.9
#> # … with 123 more rows

Here e_inc_num is used rather than e_inc_100k as incidence rates are being estimated based on notified cases. This allows country level rates to be compared to regional (using compare_to_region = TRUE) and global (using compare_to_world = TRUE) rates.
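
To see why counts are the natural starting point for regional and global comparisons, the sketch below pools counts and populations before converting to a rate. It is a minimal illustration in base R, not the package's implementation. The column names `e_inc_num` and `e_pop_num` match those in the WHO data, but every number in the toy data frame is invented.

```r
# Toy data: the case counts and populations are invented for illustration only.
toy <- data.frame(
  country   = c("A", "B", "C"),
  e_inc_num = c(500, 2000, 150),  # estimated incident cases
  e_pop_num = c(5e6, 1e7, 3e6)    # population
)

# Country-level rates per 100,000 population.
toy$rate_100k <- toy$e_inc_num / toy$e_pop_num * 1e5

# Regional rate: pool the counts first, then divide by the pooled population.
regional_rate <- sum(toy$e_inc_num) / sum(toy$e_pop_num) * 1e5

# An unweighted mean of the country rates ignores population size
# and gives a different answer.
naive_mean <- mean(toy$rate_100k)

c(regional = regional_rate, naive = naive_mean)
```

Pooling counts weights each country by its population, which is why a rate column on its own is not enough to build a regional summary.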

See Functions for more details of the functions used and for more package functionality. Note the fuzzy country matching: all functions first try to match your country request exactly and, if that fails, search for partial matches. The plots above can be made interactive by specifying interactive = TRUE.
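
The exact-then-partial matching idea can be sketched in a few lines of base R. The helper `match_country` below is hypothetical and is not how getTBinR implements its matching; the country list is a small illustrative sample.

```r
# Hypothetical helper, not the package's implementation: try an exact
# (case-insensitive) match first and fall back to partial matching.
match_country <- function(query, available) {
  exact <- available[tolower(available) == tolower(query)]
  if (length(exact) > 0) {
    return(exact)
  }
  # fixed = TRUE treats the query as plain text rather than a regular expression.
  available[grepl(tolower(query), tolower(available), fixed = TRUE)]
}

# Illustrative sample of country names.
countries <- c("United Kingdom", "United States of America",
               "United Arab Emirates", "France")

match_country("france", countries)  # exact match, ignoring case
match_country("United", countries)  # no exact match, so all partial matches
```

Returning every partial match leaves it to the caller to decide whether an ambiguous request should raise a warning or be used as is.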

Additional datasets

On top of the core datasets provided by default, getTBinR also supports importing multiple other datasets. These include data on latent TB, HIV surveillance, intervention budgets, and outcomes. The currently supported datasets are listed below,

knitr::kable(getTBinR::available_datasets[, 1:4])

| dataset | description | timespan | default |
|---|---|---|---|
| Estimates | Generated estimates of TB mortality, incidence, case fatality ratio, and treatment coverage (previously called case detection rate). Data available split by HIV status. | 2000-2018 | yes |
| Estimates | Generated estimates for the proportion of TB cases that have rifampicin-resistant TB (RR-TB, which includes cases with multidrug-resistant TB, MDR-TB), RR/MDR-TB among notified pulmonary TB cases. | 2018 | yes |
| Incidence by age and sex | Generated estimates of TB incidence stratified by age and sex. This dataset is currently experimental. | 2018 | no |
| Latent TB infection | Generated estimates incidence of latent TB stratified by age. | 2018 | no |
| Notification | TB notification dataset linking to TB notifications as raw numbers. Age-stratified, with good data dictionary coverage but has large amounts of missing data. | 1980-2018 | no |
| Drug resistance surveillance | Country level drug resistance surveillance. Lists drug resistance data from country level reporting. Good data dictionary coverage but has large amounts of missing data. | 2018 | no |
| Non-routine HIV surveillance | Country level, non-routine HIV surveillance data. Good data dictionary coverage but with a large amount of missing data. | 2007-2018 | no |
| Outcomes | Country level TB outcomes data. Lists numeric outcome data, very messy but with good data dictionary coverage. | 1994-2018 | no |
| Budget | Current year TB intervention budgets per country. Many of the data fields are cryptic but has good data dictionary coverage. | 2018 | no |
| Expenditure and utilisation | Previous year expenditure on TB interventions. Highly detailed, with good data dictionary coverage but lots of missing data. | 2018 | no |
| Policies and services | Lists TB policies that have been implemented per country. Highly detailed, with good data dictionary coverage but lots of missing data. | 2018 | no |
| Community engagement | Lists community engagement programmes. Highly detailed, with good data dictionary coverage but lots of missing data. | 2013-2018 | no |
| Laboratories | Country specific laboratory data. Highly detailed, with good data dictionary coverage but lots of missing data. | 2009-2018 | no |

These datasets can be imported into R by supplying the name of the required dataset to the additional_datasets argument of get_tb_burden (or any of the various plotting/summary functions). Alternatively, they can all be imported in one go using additional_datasets = "all", as below,

getTBinR::get_tb_burden(additional_datasets = "all")
#> # A tibble: 8,694 x 485
#>    country iso2  iso3  iso_numeric g_whoregion  year e_pop_num e_inc_100k
#>    <chr>   <chr> <chr>       <int> <chr>       <int>     <int>      <dbl>
#>  1 Afghan… AF    AFG             4 Eastern Me…  2000  20779953        190
#>  2 Afghan… AF    AFG             4 Eastern Me…  2001  21606988        189
#>  3 Afghan… AF    AFG             4 Eastern Me…  2002  22600770        189
#>  4 Afghan… AF    AFG             4 Eastern Me…  2003  23680871        189
#>  5 Afghan… AF    AFG             4 Eastern Me…  2004  24726684        189
#>  6 Afghan… AF    AFG             4 Eastern Me…  2005  25654277        189
#>  7 Afghan… AF    AFG             4 Eastern Me…  2006  26433049        189
#>  8 Afghan… AF    AFG             4 Eastern Me…  2007  27100536        189
#>  9 Afghan… AF    AFG             4 Eastern Me…  2008  27722276        189
#> 10 Afghan… AF    AFG             4 Eastern Me…  2009  28394813        189
#> # … with 8,684 more rows, and 477 more variables: e_inc_100k_lo <dbl>,
#> #   e_inc_100k_hi <dbl>, e_inc_num <int>, e_inc_num_lo <int>,
#> #   e_inc_num_hi <int>, e_tbhiv_prct <dbl>, e_tbhiv_prct_lo <dbl>,
#> #   e_tbhiv_prct_hi <dbl>, e_inc_tbhiv_100k <dbl>, e_inc_tbhiv_100k_lo <dbl>,
#> #   e_inc_tbhiv_100k_hi <dbl>, e_inc_tbhiv_num <int>, e_inc_tbhiv_num_lo <int>,
#> #   e_inc_tbhiv_num_hi <int>, e_mort_exc_tbhiv_100k <dbl>,
#> #   e_mort_exc_tbhiv_100k_lo <dbl>, e_mort_exc_tbhiv_100k_hi <dbl>,
#> #   e_mort_exc_tbhiv_num <int>, e_mort_exc_tbhiv_num_lo <int>,
#> #   e_mort_exc_tbhiv_num_hi <int>, e_mort_tbhiv_100k <dbl>,
#> #   e_mort_tbhiv_100k_lo <dbl>, e_mort_tbhiv_100k_hi <dbl>,
#> #   e_mort_tbhiv_num <int>, e_mort_tbhiv_num_lo <int>,
#> #   e_mort_tbhiv_num_hi <int>, e_mort_100k <dbl>, e_mort_100k_lo <dbl>,
#> #   e_mort_100k_hi <dbl>, e_mort_num <int>, e_mort_num_lo <int>,
#> #   e_mort_num_hi <int>, cfr <dbl>, cfr_lo <dbl>, cfr_hi <dbl>, cfr_pct <int>,
#> #   cfr_pct_lo <int>, cfr_pct_hi <int>, c_newinc_100k <dbl>, c_cdr <dbl>,
#> #   c_cdr_lo <dbl>, c_cdr_hi <dbl>, source_rr_new <chr>,
#> #   source_drs_coverage_new <chr>, source_drs_year_new <int>,
#> #   e_rr_pct_new <dbl>, e_rr_pct_new_lo <dbl>, e_rr_pct_new_hi <dbl>,
#> #   e_mdr_pct_rr_new <int>, source_rr_ret <chr>, source_drs_coverage_ret <chr>,
#> #   source_drs_year_ret <int>, e_rr_pct_ret <dbl>, e_rr_pct_ret_lo <dbl>,
#> #   e_rr_pct_ret_hi <dbl>, e_mdr_pct_rr_ret <int>, e_inc_rr_num <int>,
#> #   e_inc_rr_num_lo <int>, e_inc_rr_num_hi <int>, e_mdr_pct_rr <dbl>,
#> #   e_rr_in_notified_labconf_pulm <int>,
#> #   e_rr_in_notified_labconf_pulm_lo <int>,
#> #   e_rr_in_notified_labconf_pulm_hi <int>, source_hh <chr>, e_hh_size <dbl>,
#> #   prevtx_data_available <int>, newinc_con04_prevtx <int>,
#> #   ptsurvey_newinc <int>, ptsurvey_newinc_con04_prevtx <int>,
#> #   e_prevtx_eligible <dbl>, e_prevtx_eligible_lo <dbl>,
#> #   e_prevtx_eligible_hi <dbl>, e_prevtx_kids_pct <dbl>,
#> #   e_prevtx_kids_pct_lo <dbl>, e_prevtx_kids_pct_hi <dbl>, new_sp <int>,
#> #   new_sn <int>, new_su <int>, new_ep <int>, new_oth <int>, ret_rel <int>,
#> #   ret_taf <int>, ret_tad <int>, ret_oth <int>, newret_oth <int>,
#> #   new_labconf <int>, new_clindx <int>, ret_rel_labconf <int>,
#> #   ret_rel_clindx <int>, ret_rel_ep <int>, ret_nrel <int>,
#> #   notif_foreign <int>, c_newinc <int>, new_sp_m04 <int>, new_sp_m514 <int>,
#> #   new_sp_m014 <int>, new_sp_m1524 <int>, new_sp_m2534 <int>,
#> #   new_sp_m3544 <int>, new_sp_m4554 <int>, …

Once imported, these datasets can be used in the plotting and summary functions provided by getTBinR (by passing them to their df argument or using the additional_datasets argument in each function). See the contributing section if there are any other datasets that you think getTBinR should support or if you have suggestions for better descriptions for each dataset.

WHO-inspired themes and palettes.

The WHO makes use of several standardised plot themes and colour palettes. getTBinR implements these so that the package can be easily used internally at the WHO or by those collaborating with the WHO.

getTBinR::plot_tb_burden_summary(countries = "United Kingdom", 
                                 compare_all_regions = FALSE, 
                                 compare_to_region = TRUE) +
  getTBinR::theme_who() +
  getTBinR::scale_colour_who(reverse = TRUE) +
  getTBinR::scale_fill_who(reverse = TRUE)

Shiny dashboard

To explore the package functionality in an interactive session, or to investigate TB without having to code extensively in R, a shiny dashboard has been built into the package. This can either be used locally using,

getTBinR::run_tb_dashboard()

Or accessed online. Any metric in the WHO data can be explored, with country selection using the built-in map and animation by year.

Snapshot of the integrated dashboard.

Country report

To get a detailed overview of TB in a country of your choice run the following. The report is also available from the built-in dashboard above.

## Code saves report into your current working directory
getTBinR::render_country_report(country = "United Kingdom", save_dir = ".")

Example report for the United Kingdom.

Contributing

File an issue here if there is a feature, or a dataset, that you think is missing from the package, or better yet submit a pull request!

Please note that the getTBinR project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Citing

If using getTBinR please consider citing the package in the relevant work. Citation information can be generated in R using the following (after installing the package),

citation("getTBinR")
#> 
#> To cite getTBinR in publications use:
#> 
#>   Sam Abbott (2019). getTBinR: an R package for accessing and
#>   summarising the World Health Organisation Tuberculosis data Journal
#>   of Open Source Software, 4(34), 1260. doi: 10.21105/joss.01260
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Article{,
#>     title = {getTBinR: an R package for accessing and summarising the World Health Organisation Tuberculosis data},
#>     author = {Sam Abbott},
#>     journal = {Journal of Open Source Software},
#>     year = {2019},
#>     volume = {4},
#>     number = {34},
#>     pages = {1260},
#>     doi = {10.21105/joss.01260},
#>   }

Docker

This package has been developed in docker, based on the rocker/tidyverse image. To access the development environment, enter the following at the command line (with an active docker daemon running),

docker pull seabbs/gettbinr
docker run -d -p 8787:8787 -e USER=getTBinR -e PASSWORD=getTBinR --name getTBinR seabbs/gettbinr

The RStudio client can be accessed on port 8787 at localhost (or your machine's IP). The default username is getTBinR and the default password is getTBinR. Alternatively, access the development environment via binder.

gettbinr's People

Contributors

arfon, mariabnd, seabbs

gettbinr's Issues

Rmarkdown not installed by run_tb_dashboard but needed to render the TB report.

Describe the bug
When rendering a TB country report via the shiny app an error occurs in a clean R install.

To Reproduce

getTBinR::run_tb_dashboard()

Select a country and click country report.

Expected behavior

Report should be generated and downloaded.

Additional context
render_country_report should also check for Rmarkdown.
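
One way to implement such a check is a small guard built on `requireNamespace()`. The helper name `check_pkg_installed` is hypothetical and the error wording is only a suggestion; this is a sketch of the idea, not the fix adopted in the package.

```r
# Hypothetical helper: stop early with an actionable message if a package
# needed for an optional feature is not installed.
check_pkg_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      "Package '", pkg, "' is required for this feature. ",
      "Install it with install.packages(\"", pkg, "\").",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# 'stats' ships with R, so this check passes silently.
check_pkg_installed("stats")
```

Calling a guard like this with "rmarkdown" at the top of both `render_country_report` and the dashboard's report handler would surface the missing dependency before rendering starts.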

Data import error

Dear getTBinR gurus,

this is a great application for anyone interested in tuberculosis data. Sadly enough, however, I came across a problem. After installation according to the vignette, trying to use

tb_burden <- get_tb_burden()

leads to apparent downloading of the data; however, they are not read in correctly. The screen log is:

tb_burden <- get_tb_burden()
Downloading data from: https://extranet.who.int/tme/generateCSV.asp?ds=estimates
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  272k    0  272k    0     0   192k      0 --:--:--  0:00:01 --:--:--  194k
Downloading the data using fread::data.table has failed. Trying
again using utils::read.csv
Downloading data has failed after 1 tries.
Attempting data download in 3.4 seconds.
Downloading data from: https://extranet.who.int/tme/generateCSV.asp?ds=estimates
Downloading the data using fread::data.table has failed. Trying
again using utils::read.csv
Downloading data has failed after 2 tries.
Attempting data download in 2.9 seconds.
Downloading data from: https://extranet.who.int/tme/generateCSV.asp?ds=estimates
Downloading the data using fread::data.table has failed. Trying
again using utils::read.csv
Downloading data has failed after 3 tries.
Attempting data download in 4.4 seconds.
Downloading data from: https://extranet.who.int/tme/generateCSV.asp?ds=estimates
Downloading the data using fread::data.table has failed. Trying
again using utils::read.csv
Downloading data has failed after 4 tries.
Attempting data download in 2.5 seconds.
Downloading data from: https://extranet.who.int/tme/generateCSV.asp?ds=estimates
Downloading the data using fread::data.table has failed. Trying
again using utils::read.csv
Downloading data has failed after 5 tries.
Attempting data download in 1.2 seconds.
Error in get_data(url = url, download_data = download_data, data_trans_fn = trans_burden_data, :
Data downloading has failed, check your internet connection.
If this issue is not resolved, contact the package author.
In addition: Warning messages:
1: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
URL 'https://extranet.who.int/tme/generateCSV.asp?ds=estimates': status was 'Failure when receiving data from the peer'
2: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
URL 'https://extranet.who.int/tme/generateCSV.asp?ds=estimates': status was 'Failure when receiving data from the peer'
3: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
URL 'https://extranet.who.int/tme/generateCSV.asp?ds=estimates': status was 'Failure when receiving data from the peer'
4: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
URL 'https://extranet.who.int/tme/generateCSV.asp?ds=estimates': status was 'Failure when receiving data from the peer'
5: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
URL 'https://extranet.who.int/tme/generateCSV.asp?ds=estimates': status was 'Failure when receiving data from the peer'

Using wget https://extranet.who.int/tme/generateCSV.asp?ds=estimates , the data are read into a file on my local machine (GNU/Kubuntu Linux 16.04.3 LTS running Linux 4.15.5, R 3.4.3) like a charm.

I'd love a clue on how I should proceed to have the data read in properly.

Best wishes and regards,

Ernst
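
The retry pattern visible in the log above (several attempts with a short random wait between them, then a final error) can be sketched as follows. This is an illustration only, not the package's `get_data` code: `retry_read` and `flaky` are hypothetical names, and the failing source is simulated so that the example runs without a network connection.

```r
# Hypothetical retry wrapper; `read_fn` stands in for whatever performs the
# download and parsing. `wait` returns the number of seconds to pause.
retry_read <- function(read_fn, max_tries = 5,
                       wait = function() runif(1, 0, 5)) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(read_fn(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Downloading data has failed after ", attempt, " tries.")
    if (attempt < max_tries) {
      Sys.sleep(wait())
    }
  }
  stop("Data downloading has failed, check your internet connection.",
       call. = FALSE)
}

# Simulated source that fails twice before succeeding (no network needed).
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("Failure when receiving data from the peer")
  data.frame(country = "United Kingdom", year = 2018)
}

out <- retry_read(flaky, wait = function() 0)
```

Retrying only helps with transient failures. If every attempt fails in the same way, as in the log, the cause is more likely to lie in how the connection is made than in timing.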

Review available data

The WHO TB report appears to draw on additional cost information. It makes sense to integrate this into the package.

Steps are:

  • Review WHO TB Report and note data being used
  • Find data sources
  • Incorporate into get_tb_burden as for MDR data
  • Add to examples
  • Add tests for new data importing
  • Add details to documentation
  • Add details to news
  • Review if any new features

Return correct confidence intervals when supplying countries to `summarise_tb_burden`

Describe the bug
When computing summary statistics using summarise_tb_burden and supplying a list of countries, NA values are returned rather than the known data values. This issue is only valid for the "mean". For "median", confidence intervals are not possible, so NA values make sense.

To Reproduce

library(getTBinR)

summarise_tb_burden(countries = "United Kingdom", 
                    metric = "e_mdr_pct_rr_new", verbose = FALSE,
                    year = 2017, stat = "mean") 

Expected behaviour
The expected behaviour is that the NA values are returned for all single countries. Where it makes sense actual country level confidence intervals should be returned.

Desktop (please complete the following information):
Using development docker container (seabbs/gettbinr)

Expand the feature set of the shiny app.

The built-in shiny app does not currently support any of the features released since 0.5.0 or the new datasets supported from 0.6.0. These should be included prior to a new release.

Submit to CRAN

  • Solve all other issues
  • Add basic functionality and test extensively

Vignette: Case study

Case study using the full functionality of the package. Possible ideas are:

  • High incidence countries. Similarities/differences.
  • United Kingdom. Epidemiology in comparison to other countries.
  • TB in Europe.

CRAN check failed

  • Issues downloading the data still cause issues for CRAN check

Options

  1. Skip tests and examples that rely on data on CRAN. This is all tests and examples.
  2. Provide a static copy of current data, checking the WHO permissions this appears to be allowed.

curl::curl_download Error

Data fails to download on Windows, as of the 22/01/18

Error is:

>tb_burden <- get_data(url = "https://extranet.who.int/tme/generateCSV.asp?ds=estimates",
+ save_name = "TB_burden",
+ save = TRUE, 
+ download_data = TRUE)
Downloading data from: https://extranet.who.int/tme/generateCSV.asp?ds=estimates
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 19471    0 19471    0     0  19471      0 --:--:--  0:00:01 --:--:-- 10046
100  272k    0  272k    0     0   136k      0 --:--:--  0:00:02 --:--:--  106k
Error in curl::curl_download(input, tt, mode = "wb", quiet = !showProgress) : 
  Failure when receiving data from the peer
Calls: get_data -> <Anonymous> -> <Anonymous> -> .Call
Execution halted

Implement a new `metric_summary` function

The current TB report uses an internal function to summarise a given metric for the country of interest. This functionality should be expanded and added to the general package functionality.

Steps are:

  • Implement skeleton function
  • Implement basic examples
  • Use package function in TB report
  • Add comprehensive tests
  • Add additional features
  • Add to package examples
  • Add to TB dashboard

Plot regional comparison

Make a regional comparison plot for any given metric. Leverage #34 to summarise data for each region. The final plot should look like these.

Plot options:

  • Facet, colour palette, axis scales etc....

Summarise statistic across region/world/user defined group of countries

For a given statistic provide a summary measure across multiple countries (this needs to be specified across regions, or be user-defined). The summary measure should include the appropriate uncertainty.

It also needs to be able to handle measures that have no uncertainty, and measures that should be calculated based on a weighting (e.g. incidence rates).

Forthcoming release of ggplot2 and getTBinR

We are contacting you because you are the maintainer of getTBinR, which imports ggplot2 and uses vdiffr to manage visual test cases. The upcoming release of ggplot2 includes several improvements to plot rendering, including the ability to specify lineend and linejoin in geom_rect() and geom_tile(), and improved rendering of text. These improvements will result in subtle changes to your vdiffr doppelgangers when the new version is released.

Because vdiffr test cases do not run on CRAN by default, your CRAN checks will still pass. However, we suggest updating your visual test cases with the new version of ggplot2 as soon as possible to avoid confusion. You can install the development version of ggplot2 using remotes::install_github("tidyverse/ggplot2").

If you have any questions, let me know!

Make custom name override data set country name

Is your feature request related to a problem? Please describe.
When supplying a custom country name (e.g. United Kingdom) the current default is to override this with the fuzzy matched country name in the data. This has two negative consequences.

  1. User name is not respected. The default should be to do what the user wants.

  2. Country names in the data can be very long and/or archaic. Supplying a custom name is a good way to override this.

Describe the solution you'd like
Switch naming priority with the user name respected.

Improve test coverage

Is your feature request related to a problem? Please describe.
Test coverage has degraded with recent package versions.

Describe the solution you'd like
Get coverage over 95% again.

Additional context
Main issues are the changes to summarise_tb_burden and any additional tests required by new functionality

Add "sum" statistic to `summarise_tb_burden`

Is your feature request related to a problem? Please describe.
When looking at the data it would be useful to be able to summarise some metrics (e.g. notifications) with a sum.

Describe the solution you'd like
Add a stat = "sum" option to summarise_tb_burden and feed through to plot_tb_burden_summary
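
A minimal sketch of how a "sum" option could sit alongside the existing statistics, using `match.arg()` and `switch()`. The function `summarise_values` is hypothetical and much simpler than `summarise_tb_burden`. Dropping missing values with `na.rm = TRUE` is an assumption rather than a decision recorded in this issue.

```r
# Hypothetical, simplified stand-in for the statistic selection step.
summarise_values <- function(x, stat = c("mean", "median", "sum")) {
  stat <- match.arg(stat)  # rejects anything outside the allowed set
  switch(stat,
    mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sum    = sum(x, na.rm = TRUE)
  )
}

# Invented notification counts with one missing value.
notifications <- c(120, 80, NA, 200)

summarise_values(notifications, stat = "sum")
summarise_values(notifications, stat = "mean")
```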

Fix failing tests

Tests are failing on both travis and appveyor - evaluate why this is happening and fix

Resolve dashboard plot clipping

In the current implementation of run_tb_dashboard box sizes are static and plotly plots clip these box limits. This is related to #54 and both issues need to be resolved at the same time.

Options are:

  1. Fix current plotly plots to be correctly bounded in the current static implementation
  2. Move to static plots
  3. Move to another plotting package (e.g. highcharter)

Note: Adding another package means code duplication.

Shiny dashboard for package

A map-based dashboard showing current incidence rates. When a country is selected, show summary plots for that country and offer a downloadable report. Country reports should include incidence rates in comparison to the region, incidence rates over time, mortality over time, etc.

Fix budget dataset

Describe the bug
The budget dataset only contains data for the following year but is joined to data that contains notifications up to the current year. This means that all budget data is excluded by default.

To Reproduce

library(getTBinR)

tb <- get_tb_burden(additional_datasets = "Budget")

summary(tb)

Expected behavior
Need to either backdate the budget data or allow for future dates when joining
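
The year mismatch and both proposed remedies can be reproduced with toy data and base R's `merge()`. All values below are invented, and `budget_tot` is used as a hypothetical column name. This is a sketch of the join behaviour, not the package's import code.

```r
# Toy data: values are invented; only the year mismatch matters.
burden <- data.frame(country = "A", year = 2016:2018, c_newinc = c(100, 95, 90))
budget <- data.frame(country = "A", year = 2019, budget_tot = 1.2e6)

# A left join on country and year finds no matching budget rows,
# so every budget value is NA.
joined <- merge(burden, budget, by = c("country", "year"), all.x = TRUE)
all(is.na(joined$budget_tot))

# Option 1: backdate the budget year before joining.
budget_backdated <- transform(budget, year = year - 1)
joined_backdated <- merge(burden, budget_backdated,
                          by = c("country", "year"), all.x = TRUE)

# Option 2: allow future dates by using a full join instead.
joined_full <- merge(burden, budget, by = c("country", "year"), all = TRUE)
```

Backdating keeps the table aligned with the notification years but changes what the year column means for budget fields. The full join preserves the original years at the cost of rows that are mostly missing.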

Additional data sources

Need list of possible additional data sources. These could include more detailed individual country level datasets, data on vaccination, data on treatment etc.

Move to using ggplot2 in package best practices

Based on this post the following best practices need to be implemented.

  • Don't fully import ggplot2, instead import only used functions
  • Switch to using vars(.data[[col]]) rather than aes_string.
  • Make an S3 plot and summary method for TB data (using autoplot).
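
As an illustration of the second point, the sketch below maps columns given as strings using the `.data` pronoun inside `aes()` instead of `aes_string()`. It assumes ggplot2 (version 3.0.0 or later) is installed. The helper `plot_metric` is hypothetical, and the values are the first UK rows of the `summarise_tb_burden()` output shown in the README.

```r
library(ggplot2)

# Hypothetical plotting helper that takes column names as strings.
# .data[[x]] looks the column up in the plot data at build time.
plot_metric <- function(df, x = "year", y = "e_inc_100k") {
  ggplot(df, aes(x = .data[[x]], y = .data[[y]])) +
    geom_line()
}

# First five UK values from the README output, used purely as example data.
uk <- data.frame(year = 2000:2004,
                 e_inc_100k = c(11.9, 11.5, 13.1, 13.4, 13.2))

p <- plot_metric(uk)
```

Inside a package, `.data` would also need to be imported from rlang to avoid a check note about an undefined global variable.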

Update to 2017.

  • Update all defaults and docs to 2017.
  • Make robust to future data updates
  • Update dev version with changes
  • Push to CRAN after tests
  • Blog post advertising updates and highlighting key changes year on year.

Incomplete LICENSE file

GPL-3 is an appropriate OSI-approved license for a JOSS submission. However, the bottom of the LICENSE file in this repo is unfilled (note the angle-bracket placeholders left near the bottom).

Part of this JOSS review.

Stop `search_data_dict` from returning an Error when nothing found

Is your feature request related to a problem? Please describe.
When search_data_dict finds nothing currently an error is returned. This is not ideal as it can cause downstream functions to fail if a variable has no entry in the data dictionary. The default should be to fail but return the base metric as the label in these cases.

Describe the solution you'd like
Change search_data_dict to return something other than an error when nothing is found.

Describe alternatives you've considered
Change all use cases to cope better when search_data_dict fails

Additional context
Reproduce with the following:

library(getTBinR)
search_data_dict('g4gt23q2')
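
One possible shape for the requested behaviour is to wrap the lookup in `tryCatch()` and fall back to the variable name with a warning. The sketch below uses a toy dictionary and hypothetical helpers (`strict_lookup`, `label_or_name`). It does not call `search_data_dict`, and the dictionary column names are assumptions.

```r
# Toy dictionary with a single illustrative entry.
toy_dict <- data.frame(
  variable_name = "e_inc_100k",
  definition = "Estimated incidence (all forms) per 100 000 population",
  stringsAsFactors = FALSE
)

# Hypothetical lookup that errors when nothing is found, mimicking the
# behaviour described in this issue.
strict_lookup <- function(var, dict) {
  hit <- dict$definition[dict$variable_name == var]
  if (length(hit) == 0) stop("No results found for: ", var)
  hit[1]
}

# Wrapper that warns and falls back to the variable name instead of failing.
label_or_name <- function(var, dict) {
  tryCatch(
    strict_lookup(var, dict),
    error = function(e) {
      warning("No data dictionary entry for '", var,
              "'; using the variable name as the label.", call. = FALSE)
      var
    }
  )
}

label_or_name("e_inc_100k", toy_dict)
suppressWarnings(label_or_name("g4gt23q2", toy_dict))
```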

Fix issue with assigning custom label to legend.

When adding custom label to legend there appears to be a failure when using non standard variable names. This may be resolved by changing the legend title rather than adding the label as a new variable name

plot country to region/world/user selected countries

Plot a comparison of a metric in a given country to that in the region/world/user selected countries.

  • The function needs to accept a single country, logical for the region, list of countries, logical for the world.
  • Should then estimate the summary for each of these using #34
  • Bind together into a single tibble giving appropriate names (list of countries needs to have a user-settable label)
  • Plot metric stratifying by colour for country, region, world, list of countries.
  • Provide options for facetting, axis scaling, showing legend etc.

Improve dashboard performance

The current implementation of the run_tb_dashboard function is feature complete but does not run smoothly on low compute servers. This negatively impacts the user experience.

The majority of the bottleneck appears to be in the generation of plotly interactive plots. There are 3 options for resolving this.

  1. Switch to non-interactive plots
  2. Switch to base plotly interactive plots
  3. Switch to another htmlwidgets package - like highcharter

The downside of all options bar 1 is that this would require using code not included in getTBinR or duplicating functionality that is already present.

Pediatric MDR-TB data

Hello Sam,

is there any way to access data on MDR-TB stratified by age? I would be interested in visualizing MDR-TB cases only in children aged 0-14 years (as WHO routinely reports).

Thank you!

Best,
Matthias

[JOSS] Statement of Need

Minor issue:
This issue is related to openjournals/joss-reviews#1260

Please add a sentence to the readme that emphasizes the need of this package (e.g., there's no other package, or there's no straightforward way to get the data w/o getTBinR, ...).
