Comments (22)
just starting it now
from covid-rt-estimates.
I was contemplating this earlier - the raw sets of results are valid and good but re-assembling them would be helpful. I can see this possibly applying to multiple datasets at summary and also at subregion level.
Yes, I think hard-coding for now and expanding later would be a good idea. Providing some kind of all-estimates summary CSV would no doubt be sensible later as well.
Thanks both. I'll have a go writing this as an independent function and stick it in a PR for edits as needed.
As discussed on the conference call, we will test whether united-kingdom-admissions is in the include list and, if so, trigger the function once that dataset has been processed:
https://github.com/epiforecasts/covid-rt-estimates/blob/master/R/run-region-updates.R#L31
if ("united-kingdom-admissions" %in% includes) {
  callSamsFunction()
}
Whilst a bit nasty, we know this will be the last dataset to be processed for the UK, and our current production environment triggers each in turn.
This will need changing when we move to parallel machines for data processing.
Error in data.table::fwrite(df, here::here("subnational", name, "collated", :
No such file or directory: '/home/covid/subnational/united-kingdom/collated/rt/2020-10-08.csv'. Unable to create new file for writing (it does not exist already). Do you have permission to
write here, is there space on the disk and does the path exist?
Calls: run_regional_updates -> collate_estimates ->
In addition: Warning messages:
1: In storage.mode(default) <- type :
NAs introduced by coercion to integer range
2: In dir.create(here::here("subnational", name, "collated", target)) :
cannot create dir '/home/covid/subnational/united-kingdom/collated/rt', reason 'No such file or directory'
Hi @kathsherratt,
This all seems pretty sensible and I think controlling the end product here explicitly is quite important. I might like to call the target folder summary, as we can then potentially put other things in it at a later date. I would also suggest using an rt folder inside that and adding dated CSVs to it, rather than adding the dated CSVs directly.
@joeHickson for these sorts of ad-hoc jobs that don't fit the general infrastructure, does it make sense to just set up a daily cron job to take whatever is sitting in the target folders and produce this?
yes, agree.
one option is to say this is a publication task and bundle it with website but equally I could be tempted to add a new function here:
https://github.com/epiforecasts/covid-rt-estimates/blob/master/R/run-region-updates.R#L31
something like "collate variants" that takes any multi-subfolder (e.g. /cases/ + /deaths/ ) and brings them together into a /collated/ variant. It could run at the end of every loop and we rely on github to go "no change" to the folders most of the time.
something like
x/cases/...
x/deaths/...
goes to
x/collated/summary with cols cases, deaths copied from the path naming?
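A minimal sketch of the "collate variants" idea above, assuming each variant folder holds a summary/rt.csv file with identical columns. The function name collate_variants, the file name rt.csv, and the long-format type column are all assumptions made for illustration, not the repository's actual layout or code:

```r
# Hypothetical helper: stack <root>/<variant>/summary/rt.csv for each variant
# and tag every row with the variant name taken from the folder path.
collate_variants <- function(root, variants = c("cases", "deaths"),
                             file = file.path("summary", "rt.csv")) {
  pieces <- lapply(variants, function(variant) {
    path <- file.path(root, variant, file)
    if (!file.exists(path)) return(NULL)  # skip variants with no output yet
    df <- utils::read.csv(path, stringsAsFactors = FALSE)
    df$type <- variant                    # column value copied from the path
    df
  })
  pieces <- Filter(Negate(is.null), pieces)
  if (length(pieces) == 0) return(NULL)
  collated <- do.call(rbind, pieces)

  # Write the result alongside the inputs as <root>/collated/<file>.
  out <- file.path(root, "collated", file)
  dir.create(dirname(out), recursive = TRUE, showWarnings = FALSE)
  utils::write.csv(collated, out, row.names = FALSE)
  collated
}

# Toy data in a temporary folder standing in for "x".
root <- tempfile("x-")
for (v in c("cases", "deaths")) {
  dir.create(file.path(root, v, "summary"), recursive = TRUE)
  utils::write.csv(data.frame(region = "A", median = 1),
                   file.path(root, v, "summary", "rt.csv"), row.names = FALSE)
}
collated <- collate_variants(root)
```

Rewriting the collated file on every loop is cheap here, and an unchanged file produces no git diff, which matches the "rely on github to go no change" point.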
Yes, I was considering bundling it with the website, but I really don't want to go adding more data/links to this repo there (my hope is to drop the need to clone the covid-rt-estimates repo and to get the website building just from summary CSVs; obviously nowhere near that yet).
Yes, that all makes sense, but my preference would be to keep all the estimates from a given place (i.e. the UK) in the same folder. I can see how that would be less generalizable in the long term, but I think it will be much better for discoverability.
so my thought is you would have:
united-kingdom/cases/national/<>/data
united-kingdom/deaths/national/<>/data
united-kingdom/collated/national/<>/data
and
united-kingdom/cases/summary/<>/data
united-kingdom/deaths/summary/<>/data
united-kingdom/collated/summary/<>/data
effectively causing the collated results to appear as a separate set of results
this could also apply to the national / region folders to give you collated deaths / cases.
so collate national/<.type >/..., region/<.type >/... and subnational/*/<.type >, only collating where count(unique(type[excluding "collated"])) > 1
perhaps just putting the hardcoded option in as a quick fix is the short term solution? scanning the directories and looping through shouldn't be too difficult but it will take a bit of fiddling to get the collation correct.
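The directory scan could be sketched roughly as below. The helper name needs_collating and the toy folder names are hypothetical; the only rule taken from the comment is "collate where there is more than one type, not counting collated":

```r
# Hypothetical helper: TRUE when a location folder holds more than one
# estimate type, ignoring any existing "collated" folder.
needs_collating <- function(location_dir) {
  types <- basename(list.dirs(location_dir, recursive = FALSE))
  types <- setdiff(types, "collated")
  length(unique(types)) > 1
}

# Toy layout: "uk" has cases + deaths (+ collated), "fr" has cases only.
root <- tempfile("subnational-")
for (d in c("uk/cases", "uk/deaths", "uk/collated", "fr/cases")) {
  dir.create(file.path(root, d), recursive = TRUE)
}

locations <- list.dirs(root, recursive = FALSE)
to_collate <- basename(locations[vapply(locations, needs_collating, logical(1))])
```

Excluding "collated" before counting keeps a previous run's output from making a single-type location look like it has two types.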
This is now in place in the form of a function but needs to be linked up to the infrastructure.
@kathsherratt in the estimates I see time series of different lengths. Any ideas why?
Sounds spot on. Call should be:
collate_estimates(name = "united-kingdom", target = "rt")
If you have a local version of this running or access to the server, giving this a manual run today and publishing the estimates would be a useful test.
@seabbs is there a branch for that?
@kathsherratt PR'd and I reviewed it so it's in master (in R/utils.R) but uncalled.
@joeHickson any idea on a timeline for this to start being scheduled or ideas on actions I can take on our end to get this output?
awesome!
Sounds like it needs recursive = TRUE set. One sec. Yes, that is the issue I think.
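For reference, base R's dir.create() only creates the last path element unless recursive = TRUE is passed, which is why a missing collated/ parent makes the rt/ folder creation fail. A minimal sketch of the fix, using a hypothetical helper write_dated_csv and base write.csv rather than the repository's actual collate_estimates code:

```r
# Hypothetical helper: write a dated CSV, creating missing parent folders first.
write_dated_csv <- function(df, root, name, target, date = Sys.Date()) {
  out_dir <- file.path(root, "subnational", name, "collated", target)
  # recursive = TRUE creates "collated" and the target folder in one call;
  # showWarnings = FALSE keeps reruns quiet when the folder already exists.
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  out_file <- file.path(out_dir, paste0(date, ".csv"))
  utils::write.csv(df, out_file, row.names = FALSE)
  invisible(out_file)
}

# Toy run in a temporary folder (the data values are made up).
root <- tempfile("covid-")
path <- write_dated_csv(data.frame(region = "England", median = 1.1),
                        root, "united-kingdom", "rt", as.Date("2020-10-08"))
```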
I'll pop that in now
Done here #68 but may be faster to do elsewhere.
OK, it's in and generating, but the output is being ignored; I think it's tripping up the .gitignore date filter.