epiforecasts / covid-rt-estimates
National and subnational estimates of the time-varying reproduction number for Covid-19
Home Page: https://epiforecasts.io/covid/
License: MIT License
This sits alongside the publication task to remove the need to mount things into the container - we will only need to extract logs.
Seems stuck on the UK
Not sure what is driving this, but it may be a RAM issue (i.e. RAM not being released). As this is flagged as the primary tool for running these estimates, it would be sensible either to note that there is an issue or to point at a more robust approach (i.e. running region by region, as in the infra repo) so that others can easily reproduce the estimates.
Potentially this may actually be pointing at a more serious issue though have not been able to diagnose one.
Hi I'm from Iraq
Good Day.
Every time I run the same data source through the Excel equation, it gives me different estimation and forecasting results.
Thanks
Sabri
Due to computational and storage constraints, it is not feasible to run the complete time series every day with our current resources. For this reason, we have shifted our daily updates to focus on a rolling window of the last 3 months of data. Many users may be interested in the complete time series - please respond to this issue in order for us to assess the priority of this requirement.
We are considering two solutions to this issue:
@kathsherratt could you flag the old deprecated UK regions that we should remove from the repository. At the moment they are shown on the website and in all downstream CSVs.
@joeHickson I think the best solution to this is to delete the folders for these regions in order to stop them appearing in the summaries.
I am seeing the following:
INFO [2020-09-10 12:46:07] Data has not been updated since last run. If wanting to run again then remove /home/rstudio/covid-rt-estimates/last-update/colombia.rds
ERROR [2020-09-10 12:46:07] █
ERROR [2020-09-10 12:46:07] 1. ├─base::tryCatch(...)
ERROR [2020-09-10 12:46:07] 2. │ └─base:::tryCatchList(expr, classes, parentenv, handlers)
ERROR [2020-09-10 12:46:07] 3. │ └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
ERROR [2020-09-10 12:46:07] 4. │ └─base:::doTryCatch(return(expr), name, parentenv, handler)
ERROR [2020-09-10 12:46:07] 5. ├─base::withCallingHandlers(...)
ERROR [2020-09-10 12:46:07] 6. ├─global::run_regional_updates(regions = regions, args = args)
ERROR [2020-09-10 12:46:07] 7. │ └─global::rru_process_locations(regions, args, excludes, includes)
ERROR [2020-09-10 12:46:07] 8. │ ├─base::tryCatch(...)
ERROR [2020-09-10 12:46:07] 9. │ │ └─base:::tryCatchList(expr, classes, parentenv, handlers)
ERROR [2020-09-10 12:46:07] 10. │ │ └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
ERROR [2020-09-10 12:46:07] 11. │ │ └─base:::doTryCatch(return(expr), name, parentenv, handler)
ERROR [2020-09-10 12:46:07] 12. │ ├─base::withCallingHandlers(...)
ERROR [2020-09-10 12:46:07] 13. │ └─global::update_regional(...)
ERROR [2020-09-10 12:46:07] 14. └─base::.handleSimpleError(...)
ERROR [2020-09-10 12:46:07] 15. └─h(simpleError(msg, call))
ERROR [2020-09-10 12:46:07] colombia: object 'out' not found - update_regional, location, excludes[region == location$name], includes[region == location$name], args$force, args$timeout
WARN [2020-09-10 12:46:07] simpleWarning in outcome[[location$name]]$start <- start: Coercing LHS to a list
On a run with the new set-up, I am currently seeing about 32 GB of RAM usage vs (I think) about 5 GB max using the previous version. This could be a false impression as I have no log of RAM usage, but we should keep an eye on this to make sure it doesn't increase.
R is pretty bad at letting go of RAM within a single session so I could imagine this being a factor.
The logging is very useful but when an issue occurs that doesn't have a built-in logging message then there is no feedback as to why this may have occurred. Obviously one option would be to add logging messages for all possible issues but this will end up being essentially a rewrite of R's own warning/error system. Given this, it might be a good idea to set up console sinking to a separate logging file (https://stackoverflow.com/questions/11666086/output-error-warning-log-txt-file-when-running-r-script-under-command-line) as a secondary debug tool.
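A minimal sketch of what this console sinking could look like (the log file name and placement are illustrative, not existing code in this repo):

```r
# Sketch: mirror R's messages, warnings, and errors to a secondary debug log.
# The file name "console-debug.log" is illustrative only.
zz <- file("console-debug.log", open = "wt")
sink(zz, type = "message")  # diverts message(), warning(), and error output

# ... the update code would run here ...
message("example message that will land in the log")

# Restore normal message handling and close the connection when done
sink(type = "message")
close(zz)
```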
See server logs for details of the error.
I have sidestepped this for now by adding a tryCatch call to regional_epinow which ensures that regional_summary always gets called.
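As a sketch, the workaround looks something like the following; risky_step() stands in for the EpiNow2::regional_epinow call and is purely illustrative:

```r
# Sketch of the workaround: wrap the estimation step in tryCatch so a failure
# does not prevent the summary step from running. risky_step() is a stand-in
# for EpiNow2::regional_epinow, not the production code.
risky_step <- function() stop("simulated failure in regional_epinow")

out <- tryCatch(
  risky_step(),
  error = function(e) {
    message("estimation failed: ", conditionMessage(e))
    NULL
  }
)

# The summary step (regional_summary in the real code) still runs regardless
summary_ran <- TRUE
```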
I have been unable to reproduce this anywhere but the production server, and have not been able to inspect the line number that is causing the error there. Manual inspection of the code (both here and in EpiNow2) does not shed any light.
Currently built using a series of functions that are not called in a completely clear order. It would be great to streamline this a little and provide some more supporting documentation.
Due to a package mismatch between the server and the code, the update failed last night. @joeHickson should we clean out the last-update folder and reschedule?
Azure requires the timestamp to be the first thing on the line. I imagine other log ingestion tools won't object to this!
Set the format on the file logger to "~t ~l [~n.~f] : ~m" instead of the default of "[~l] [~t] [~n.~f] ~m".
Timestamp, Level, Namespace.CallingFunction, Message
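Assuming the file logger is built on futile.logger (whose layout tokens match those above: ~t timestamp, ~l level, ~n namespace, ~f calling function, ~m message), the change could look like this; the appender file name and namespace are illustrative:

```r
# Sketch, assuming futile.logger: put the timestamp first so Azure's log
# ingestion accepts the lines. Appender path and namespace are illustrative.
library(futile.logger)
flog.appender(appender.file("update.log"), name = "covid-rt")
flog.layout(layout.format("~t ~l [~n.~f] : ~m"), name = "covid-rt")
flog.info("layout test", name = "covid-rt")
```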
Working with the summary data: national/deaths/summary/summary_table.csv, along with rt.csv, cases_by_infection.csv, and cases_by_report.csv.
There are null summary values in summary_table.csv and missing ±20% CI estimates in the other three files. This appears to affect only Papua New Guinea.
This has caused some issues in rt_vis and RtD3 (and subsequently the epiforecasts site where RtD3 is implemented), which are in the process of being fixed.
We will work on accounting for the possibility of null values in summary estimates but I wanted to raise the issue here in case it is some broader issue.
It looks like Papua New Guinea is currently the shortest time series (10 days) which may or may not have anything to do with the problem.
Thanks!
Looking at UK test-positive estimates, it looks like the delay to report has increased and our real-time estimates are now biased downwards in some regions. @kathsherratt have you seen any data changes? We may need to increase the time lag for this dataset only in order to avoid this for now. @joeHickson that should be possible now, right?
Not seeing logs from inside check_for_update.
This should be deaths/regional/...
It probably also applies to cases.
Whilst it would be nice to be able to store full samples from each region, it may be more practical in the short term to instead save the summarised estimates available in each summary folder. This can either be linked to #9 or be a git-only implementation based on copying the summary folder (or just the CSVs it contains) into a dated folder for each region.
If a git-based intervention, then summarised results could be stored in a covid-rt-estimates-archive repo, for example, in order to avoid additional history bloat and provide a repo that is easier for others to download (as it won't contain the history of samples and hence will be much smaller even with several hundred archived CSVs).
Currently, the estimates are based on the latest master version of EpiNow2. As GitHub is our dev version of EpiNow2 this introduces some constraints (with the CRAN version being the true release version). It would make sense to make the versioning system for EpiNow2 more manual in this repo. I think the sensible way to do this is to target either certain release tags (potentially the nicest option) or commits (easier, as we don't need to add a tag on EpiNow2); either way it is not a major feature. This can then be manually incremented and frees EpiNow2 development from having to worry about introducing breaking changes here. It also makes it easier to potentially test EpiNow2 upstreams on a regular basis before putting them into production.
This issue is a particular problem now when we are looking to make breaking changes to the interface of EpiNow2 but cannot PR into master as it will break this repository. This has led to a single large version PR building up in EpiNow2 which is not the ideal dev cycle and keeps important updates we are using elsewhere from users.
When running an update in docker on an Azure cluster only half of the available cores are used.
Cores are allocated using setup_future in R/utils.R. This uses future::availableCores() internally and should default to all cores when jobs > cores; when jobs < cores, the remaining cores should be shared between jobs and used to run multiple MCMC chains. Local tests indicate all of these features are working as intended outside of Azure/script use in docker.
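The intended split can be sketched as follows; this mirrors the description above, not the exact code in R/utils.R, and the function name is illustrative:

```r
# Sketch of the intended core allocation described above. The real logic lives
# in setup_future() in R/utils.R and may differ in detail.
allocate_cores <- function(jobs, cores = future::availableCores()) {
  if (jobs >= cores) {
    # More jobs than cores: one core per job, remaining jobs queue
    list(workers = cores, chains_per_job = 1L)
  } else {
    # Fewer jobs than cores: share the spare cores between jobs so each
    # job can run multiple MCMC chains in parallel
    list(workers = jobs, chains_per_job = max(1L, cores %/% jobs))
  }
}

allocate_cores(jobs = 4, cores = 16)  # 4 workers, 4 chains each
```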
Both cases and deaths updating is broken at the national level.
Seeing lots of NULL warnings in logs running using the new script approach. @joeHickson is this expected?
Seeing this in all regional estimates but not in national-level estimates, indicating to me it's probably coming from the new covid-rt-estimates implementation. See https://github.com/epiforecasts/covid-rt-estimates/tree/estimate-test for a partial run using settings from the master (will complete as and when estimates are done).
Add a CLI flag to run-region-updates to allow a timeout length to be specified.
blocked by epiforecasts/EpiNow2#63
Linked to #62 , adding new data sources for UK estimates.
For these to be used, ideally the summary of Rt estimates needs to be output in a single csv file.
This should filter the type variable to only the Rt "estimates" (not the estimates from partial data or forecasts), with the output saved in a dedicated folder within the united-kingdom folder. Below is example code to get to what is needed. In this example the new dedicated folder for output is called "all-summary-rt".
However I am not sure where this needs to go to get it running in the process after the daily Rt estimates have finished.
# Get Rt estimates summary from each data source
library(data.table)
cases <- fread(here::here("subnational", "united-kingdom", "cases", "summary", "rt.csv"))
cases <- cases[type == "estimate"][, data_source := "test-positive cases"]
deaths <- fread(here::here("subnational", "united-kingdom", "deaths", "summary", "rt.csv"))
deaths <- deaths[type == "estimate"][, data_source := "deaths"]
admissions <- fread(here::here("subnational", "united-kingdom", "admissions", "summary", "rt.csv"))
admissions <- admissions[type == "estimate"][, data_source := "hospital admissions"]
# Bind all data sources
uk_rt <- rbindlist(list(cases, deaths, admissions), fill = TRUE, use.names = TRUE)
# Save back to main UK folder
write.csv(uk_rt, here::here("subnational", "united-kingdom", "all-summary-rt", paste0(Sys.Date(), "-uk-rt.csv")), row.names = FALSE)
It would be great to be able to specify a subset of regions to update. This would make it easier to split processing across multiple compute nodes in a fairly simple fashion.
The new implementation of global cases and deaths fails for all scales (national and regional).
I see variants of the following error tree:
WARN [2020-09-15 00:05:56] cases: NULL - mccollect, jobs, TRUE
ERROR [2020-09-15 00:05:56] █
ERROR [2020-09-15 00:05:56] 1. ├─base::tryCatch(...)
ERROR [2020-09-15 00:05:56] 2. │ └─base:::tryCatchList(expr, classes, parentenv, handlers)
ERROR [2020-09-15 00:05:56] 3. │ └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
ERROR [2020-09-15 00:05:56] 4. │ └─base:::doTryCatch(return(expr), name, parentenv, handler)
ERROR [2020-09-15 00:05:56] 5. ├─base::withCallingHandlers(...)
ERROR [2020-09-15 00:05:56] 6. ├─global::run_regional_updates(datasets = datasets, args = args)
ERROR [2020-09-15 00:05:56] 7. │ └─global::rru_process_locations(datasets, args, excludes, includes)
ERROR [2020-09-15 00:05:56] 8. │ ├─base::tryCatch(...)
ERROR [2020-09-15 00:05:56] 9. │ │ └─base:::tryCatchList(expr, classes, parentenv, handlers)
ERROR [2020-09-15 00:05:56] 10. │ │ └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
ERROR [2020-09-15 00:05:56] 11. │ │ └─base:::doTryCatch(return(expr), name, parentenv, handler)
ERROR [2020-09-15 00:05:56] 12. │ ├─base::withCallingHandlers(...)
ERROR [2020-09-15 00:05:56] 13. │ └─global::update_regional(...)
ERROR [2020-09-15 00:05:56] 14. │ └─EpiNow2::regional_epinow(...)
ERROR [2020-09-15 00:05:56] 15. │ └─future.apply::future_lapply(...)
ERROR [2020-09-15 00:05:56] 16. │ └─future.apply:::future_xapply(...)
ERROR [2020-09-15 00:05:56] 17. │ └─future::future(...)
ERROR [2020-09-15 00:05:56] 18. │ └─future:::makeFuture(...)
ERROR [2020-09-15 00:05:56] 19. │ └─future:::fun(...)
ERROR [2020-09-15 00:05:56] 20. │ ├─future::run(future)
ERROR [2020-09-15 00:05:56] 21. │ └─future:::run.MulticoreFuture(future)
ERROR [2020-09-15 00:05:56] 22. │ └─future:::requestCore(...)
ERROR [2020-09-15 00:05:56] 23. │ └─future:::usedCores()
ERROR [2020-09-15 00:05:56] 24. │ └─future:::FutureRegistry(reg, action = "list", earlySignal = TRUE)
ERROR [2020-09-15 00:05:56] 25. │ └─future:::collectValues(where, futures = futures[idxs], firstOnly = FALSE)
ERROR [2020-09-15 00:05:56] 26. │ ├─future::resolved(future, run = FALSE)
ERROR [2020-09-15 00:05:56] 27. │ └─future:::resolved.MulticoreFuture(future, run = FALSE)
ERROR [2020-09-15 00:05:56] 28. │ └─future:::signalEarly(x, ...)
ERROR [2020-09-15 00:05:56] 29. │ ├─future::result(future)
ERROR [2020-09-15 00:05:56] 30. │ └─future:::result.MulticoreFuture(future)
ERROR [2020-09-15 00:05:56] 31. │ └─future:::FutureRegistry(...)
ERROR [2020-09-15 00:05:56] 32. │ └─future:::collectValues(where, futures = futures[idxs], firstOnly = FALSE)
ERROR [2020-09-15 00:05:56] 33. │ ├─future::resolved(future, run = FALSE)
ERROR [2020-09-15 00:05:56] 34. │ └─future:::resolved.MulticoreFuture(future, run = FALSE)
ERROR [2020-09-15 00:05:56] 35. │ └─future:::signalEarly(x, ...)
ERROR [2020-09-15 00:05:56] 36. │ ├─future::result(future)
ERROR [2020-09-15 00:05:56] 37. │ └─future:::result.MulticoreFuture(future)
ERROR [2020-09-15 00:05:56] 38. │ └─base::stop(ex)
ERROR [2020-09-15 00:05:56] 39. └─(function (e) ...
ERROR [2020-09-15 00:05:56] cases: Failed to retrieve the result of MulticoreFuture (future_lapply-72) from the forked worker (on localhost; PID 1620). Post-mortem diagnostic: No process exists with this PID, i.e. the forked localhost worker is no longer alive. -
WARN [2020-09-15 00:05:56] simpleWarning in outcome[[location$name]]$start <- start: Coercing LHS to a list
Any information in the logs as to why this might be @joeHickson?
EpiNow2 has large differences in runtime between regions. Some of this may be because regions are being fit in which there is very little to no data. Identifying these regions is the first step to fixing this behaviour.
It would be better (in circumstances where there is some instability) to make sure regional and national estimates run correctly and then do subnational estimates.
To overcome an earlier runtime issue I commented out previously implemented error catching and replaced it with purrr::safely. This should probably be replaced with a proper error catcher that reports to the log. See here:
covid-rt-estimates/R/run-region-updates.R
Line 71 in 8e2d04a
We now have a regularly updated runtimes.csv that contains a large amount of information on the status of updates. Automated analysis could be performed on this to give the last update date for each region, the number of subregions in each region that timed out for that update, their last successful update, etc. This could then be extended in multiple directions to help track the performance of EpiNow2 and provide information for future model developments.
Potentially some of this analysis could be pushed to epiforecasts.io/covid so that more casual users can easily track the status of updates.
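A sketch of what the first step of such an analysis could look like; column names follow the runtimes.csv excerpt quoted later in this issue list (dataset, subregion, start_date, runtime, with runtime == 999999 marking a timeout) and are otherwise an assumption:

```r
# Sketch: summarise a runtimes table to get the last update per dataset and
# the number of timed-out subregions. In practice the table would be read
# with data.table::fread("runtimes.csv"); here a small example is inlined.
library(data.table)
runtimes <- data.table(
  dataset    = c("belgium", "belgium", "italy"),
  subregion  = c("Brussels", "Flanders", "Lombardia"),
  start_date = as.Date(c("2020-09-15", "2020-09-15", "2020-09-17")),
  runtime    = c(999999, 1200, 999999)  # 999999 marks a timeout
)
status <- runtimes[, .(
  last_update = max(start_date),
  n_timed_out = sum(runtime == 999999)
), by = dataset]
```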
Currently, we use a flat filter on all datasets to only start using data 3 days after it is first reported. This is because in some datasets updates occur over time that change reported case counts. If used without truncation these datasets lead to estimates that are biased downwards, which needs to be avoided. However, many datasets do not have this issue, and having a flat cut-off limits how real-time the estimates can be.
There are two solutions to this:
Reviewing each dataset to check if they show evidence of this behaviour and setting the cut-off accordingly. @kathsherratt has been doing a lot of work on the data and may have some thoughts on how feasible this is.
Scan the data over multiple days (as a copy of the reported data is kept in the summary folder as reported_cases.csv), detect if this is happening, and set the truncation dynamically for each dataset accordingly. @joeHickson this would be more involved but also obviously more robust, and would improve real-time performance. We can't really do this until we start storing multiple datasets.
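A sketch of what this detection could look like, comparing two snapshots of reported_cases.csv; the function name, 5% revision threshold, and column names (date, confirm) are illustrative assumptions:

```r
# Sketch: compare an older snapshot of reported case counts with today's; if
# recent counts were revised upwards by more than `threshold`, widen the
# truncation window back to the oldest materially revised date.
# suggest_truncation() is a hypothetical helper, not existing repo code.
suggest_truncation <- function(old, new, threshold = 0.05) {
  m <- merge(old, new, by = "date", suffixes = c("_old", "_new"))
  revised <- m$confirm_new > m$confirm_old * (1 + threshold)
  if (!any(revised)) return(0L)  # no revisions: no extra truncation needed
  # number of trailing days to drop, counting back to the oldest revised date
  as.integer(max(m$date) - min(m$date[revised])) + 1L
}
```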
Looks like a bug. Again, any information in the logs @joeHickson?
For the global estimates, we currently provide estimates for both cases and deaths. At the subnational level we also have both these data sources in many cases, and it would be useful to be able to support them easily.
follows on from #71
Update smg.md with the latest parameters.
When estimates are run in a list using bin/update-via-docker.sh, national estimates run without issue. Subnational estimates all hang with 0 CPU usage.
The docker logs indicate that the code has reached the EpiNow2::epinow function (i.e. each region is running).
I have been unable to reproduce this in R, in bash, or via docker (replacing the docker run command in bin/update-via-docker.sh with an Rscript call).
Any ideas on this issue would be helpful as this is currently the only blocker to running a full update using EpiNow2.
blocks #9
Specifying the root path allows for shifting the data output into a scratch location away from the git checkout. This location will then be used to publish the data out elsewhere.
Currently this will default to the existing location but potentially this should shift to default to ./data/generated and be added to the gitignore. This final move would also suggest shuffling the current .rds files from ./data to ./data/source or ./data/reference to help keep the data folder tidy.
Just released to GitHub master and so may have some teething problems. The scale of this update means we need to test it before pushing into production.
Most changes are interface-related and walked through in the README.
In run-regional-updates handle a new csv (status.csv?) that contains the latest summary for each location when processing the run outcome. This may also help with #54 .
Perhaps the following columns:
Dataset | Subregion | Last run timestamp | Last run status | Latest results generated at | Latest results data up to
---|---|---|---|---|---
united-kingdom | * | 2020-09-30 13:45:53 | No New Data Available | 2020-09-29 05:34:23 | 2020-09-25
united-kingdom | London | 2020-09-30 13:45:53 | No New Data Available | 2020-09-29 05:34:23 | 2020-09-25
last run status = Success | Error | Timed Out | No New Data Available (I think we should be able to detect all these conditions)
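A sketch of how run-region-updates could append to such a file; the column names mirror the table above, while record_status() and the file path are illustrative, not existing code:

```r
# Sketch: append one row per processed location to the proposed status.csv.
# record_status() is a hypothetical helper; columns follow the table above.
record_status <- function(path, dataset, subregion, status,
                          generated_at = NA, data_up_to = NA) {
  row <- data.frame(
    dataset = dataset,
    subregion = subregion,
    last_run_timestamp = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
    last_run_status = status,  # Success | Error | Timed Out | No New Data Available
    latest_results_generated_at = generated_at,
    latest_results_data_up_to = data_up_to
  )
  # Write the header only on first creation, then append subsequent rows
  write.table(row, path, sep = ",", row.names = FALSE,
              col.names = !file.exists(path), append = file.exists(path))
}

record_status("status.csv", "united-kingdom", "London", "Success")
```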
I see a mismatch between timestamps of estimates and summaries in Russia. This indicates an uncaught and unlogged error. The issue is likely an edge case in EpiNow2.
This was commented out as it was failing, masking the error with its own error...
At the moment estimates are stored in git and overwritten with each new update. This causes two issues: 1. Historic estimates are not available and 2. the size of the git repo grows over time meaning that periodically the history must be cleaned.
The ideal solution to this would allow programmatically pushing results for each update with an easy-to-use, bash-friendly login process.
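One hedged sketch of what such a push could look like, using a personal access token held in an environment variable; the remote URL, variable name, and commit paths are illustrative assumptions, not an existing script in this repo:

```shell
#!/usr/bin/env sh
# Sketch: push updated estimates non-interactively. Assumes a token is held
# in the GITHUB_PAT environment variable; the remote URL is illustrative.
set -eu
git add subnational national
git commit -m "Update estimates $(date +%F)" || echo "nothing to commit"
git push "https://x-access-token:${GITHUB_PAT}@github.com/epiforecasts/covid-rt-estimates.git" HEAD
```

Embedding the token in the push URL avoids any interactive credential prompt, which makes the step usable from cron or a freshly instanced worker.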
There is a policy interest in having global /higher than country-level estimates. These can either be implemented in the current case and death scripts or split out into their own processing stage. The best option is likely to split out as higher-level estimates will otherwise dominate the summary plots.
The UK case data is problematic at the moment due to the underlying increase in cases and the breakdown of reliable testing data. This instability increases the difficulty of fitting the model. Given this, it looks like the timeout needs to be increased in order to provide estimates.
Workaround for running non-interactive docker login is throwing an error.
This is used to provide a single line command to update the estimates on a newly instanced remote worker.
See here for bash script: https://github.com/epiforecasts/covid-rt-estimates/blob/master/bin/update-via-ssh.sh
R error messages can be output to a log file (https://stackoverflow.com/questions/44712959/error-handling-and-logging-in-r) - this may make debugging easier without having to search through code in interactive sessions.
This delay needs to be updated based on literature/public estimates. @kathsherratt do you have any ideas on this? When I checked the line-list I saw just 6 observations with complete data.
Subnational estimates are no longer updating and the logs show no information on where the issue could be.
We are interested in supporting an increased range of subnational estimates. Ideally, contributors will be able to do the majority of the linking work in order for the core team to focus on optimizing the code and theoretical considerations. In order for this to be possible, we need several things.
covidregionaldata, and in particular the SMG, which outlines the steps required for adding subnational datasets. This step is required in order for us to support estimates.
Starting a ticket to highlight problem locations that are having issues (from runtimes.csv).
Timeouts:
dataset subregion start_date runtime
belgium Brussels 15/09/2020 16:13 999999
belgium Flanders 15/09/2020 16:13 999999
belgium Unknown 15/09/2020 16:13 999999
belgium Wallonia 15/09/2020 16:13 999999
brazil São Paulo 15/09/2020 16:31 999999
canada New Brunswick 15/09/2020 18:33 999999
germany Baden-Württemberg 15/09/2020 20:58 999999
india Manipur 15/09/2020 22:10 999999
italy Lombardia 17/09/2020 05:58 999999
italy Trentino-Alto Adige 17/09/2020 05:58 999999
united-kingdom East Midlands 17/09/2020 07:59 999999
united-kingdom North West 17/09/2020 07:59 999999
united-kingdom Yorkshire and The Humber 17/09/2020 07:59 999999
I'm re-running Belgium now to see if it's still an issue