csv output from new UK data sources about covid-rt-estimates HOT 22 CLOSED

epiforecasts commented on August 11, 2024

csv output from new UK data sources

from covid-rt-estimates.

Comments (22)

joeHickson commented on August 11, 2024 3

just starting it now

joeHickson commented on August 11, 2024 1

I was contemplating this earlier - the raw sets of results are valid and good but re-assembling them would be helpful. I can see this possibly applying to multiple datasets at summary and also at subregion level.

seabbs commented on August 11, 2024 1

yes, I think hard coding for now and expanding later would be a good idea. Providing some kind of all estimate summary csv would no doubt be sensible later as well.

kathsherratt commented on August 11, 2024 1

Thanks both. I'll have a go writing this as an independent function and stick it in a PR for edits as needed.

joeHickson commented on August 11, 2024 1

As discussed on the conference call, we will test whether united-kingdom-admissions is in the include list and use that to trigger the function after it has processed the data:
https://github.com/epiforecasts/covid-rt-estimates/blob/master/R/run-region-updates.R#L31

if ("united-kingdom-admissions" %in% includes) {
  # placeholder name for the collation function
  callSamsFunction()
}

Whilst a bit nasty, we know this will be the last dataset to be processed for the UK, and our current production environment triggers each in turn.

This will need changing when we move to parallel machines for data processing.
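
A minimal, self-contained sketch of that check is below. The contents of `includes` and the function name `collate_uk()` are hypothetical stand-ins for illustration; only "united-kingdom-admissions" comes from this thread.

```r
# Hypothetical include list; in production this would come from the update run.
# Only "united-kingdom-admissions" is taken from the discussion above.
includes <- c("united-kingdom", "united-kingdom-admissions")

# Stand-in for the real collation function (the name is illustrative only).
collate_uk <- function() {
  message("collating UK estimates")
  invisible(TRUE)
}

# Trigger collation only when the last UK dataset is in the include list.
collated <- FALSE
if ("united-kingdom-admissions" %in% includes) {
  collated <- collate_uk()
}
```

This relies on datasets being processed one after another, so it would stop being safe once datasets are processed in parallel.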

joeHickson commented on August 11, 2024 1

Error in data.table::fwrite(df, here::here("subnational", name, "collated", :
No such file or directory: '/home/covid/subnational/united-kingdom/collated/rt/2020-10-08.csv'. Unable to create new file for writing (it does not exist already). Do you have permission to
write here, is there space on the disk and does the path exist?
Calls: run_regional_updates -> collate_estimates ->
In addition: Warning messages:
1: In storage.mode(default) <- type :
NAs introduced by coercion to integer range
2: In dir.create(here::here("subnational", name, "collated", target)) :
cannot create dir '/home/covid/subnational/united-kingdom/collated/rt', reason 'No such file or directory'

seabbs commented on August 11, 2024

Hi @kathsherratt,

This all seems pretty sensible, and I think controlling the end product here explicitly is quite important. I might prefer to call the target folder summary, as we can then potentially put other things in it at a later date. I would also suggest using an rt folder inside that and adding dated csvs to it, rather than adding the dated csvs directly.

@joeHickson for these sorts of ad-hoc jobs that don't fit the general infrastructure, does it make sense to just set up a daily cron job to take whatever is sitting in the target folders and produce this?

seabbs commented on August 11, 2024

yes, agree.

joeHickson commented on August 11, 2024

one option is to say this is a publication task and bundle it with website but equally I could be tempted to add a new function here:
https://github.com/epiforecasts/covid-rt-estimates/blob/master/R/run-region-updates.R#L31

something like "collate variants" that takes any multi-subfolder (e.g. /cases/ + /deaths/ ) and brings them together into a /collated/ variant. It could run at the end of every loop and we rely on github to go "no change" to the folders most of the time.

something like
x/cases/...
x/deaths/...

goes to
x/collated/summary with cols cases, deaths copied from the path naming?
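
One possible reading of this idea, sketched in base R under assumptions: the file name `summary.csv`, the function name `collate_variants()`, and the long format with a `type` column are all illustrative choices, not the repo's actual layout or API. The toy data is written under `tempdir()`.

```r
# Build a toy layout: x/cases/summary.csv and x/deaths/summary.csv.
root <- file.path(tempdir(), "x")
unlink(root, recursive = TRUE)
for (type in c("cases", "deaths")) {
  dir.create(file.path(root, type), recursive = TRUE, showWarnings = FALSE)
  write.csv(
    data.frame(region = c("A", "B"), estimate = c(1, 2)),  # placeholder values
    file.path(root, type, "summary.csv"),
    row.names = FALSE
  )
}

# Read each subfolder's summary, tag the rows with the folder name,
# and write the stacked result to x/collated/.
collate_variants <- function(root, file = "summary.csv") {
  types <- setdiff(
    list.dirs(root, full.names = FALSE, recursive = FALSE),
    "collated"
  )
  parts <- lapply(types, function(type) {
    df <- read.csv(file.path(root, type, file))
    df$type <- type  # value copied from the path naming
    df
  })
  collated <- do.call(rbind, parts)
  out_dir <- file.path(root, "collated")
  dir.create(out_dir, showWarnings = FALSE)
  write.csv(collated, file.path(out_dir, file), row.names = FALSE)
  collated
}

collated <- collate_variants(root)
```

A wide layout, with one column per type, would be an equally valid interpretation; the long format just keeps the sketch short.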

seabbs commented on August 11, 2024

yes I was considering bundling it with the website, but I really don't want to go adding more data/links to this repo there (my hope is to drop the need to clone the covid-rt-estimates repo and to get the website building just from summary csvs; obviously nowhere near that yet).

Yes that all makes sense, but my preference would be to keep all the estimates from a given place (i.e. the UK) in the same folder. I can see how that would be less generalizable in the long term, but I think it will be much better for discoverability.

joeHickson commented on August 11, 2024

so my thought is you would have:

united-kingdom/cases/national/<>/data
united-kingdom/deaths/national/<>/data
united-kingdom/collated/national/<>/data
and
united-kingdom/cases/summary/<>/data
united-kingdom/deaths/summary/<>/data
united-kingdom/collated/summary/<>/data

effectively causing the collated results to appear as a separate set of results

joeHickson commented on August 11, 2024

this could also apply to the national / region folders to give you collated deaths / cases.

so collate national/<.type >/..., region/<.type >/... and subnational/*/<.type >, only collating where count(unique(type[excluding "collated"])) > 1

perhaps just putting the hardcoded option in as a quick fix is the short term solution? scanning the directories and looping through shouldn't be too difficult but it will take a bit of fiddling to get the collation correct.
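
As a rough sketch of the "only collate where more than one non-collated type exists" rule, assuming each type is an immediate subfolder; `needs_collation()` and the folder names under `tempdir()` are hypothetical:

```r
# TRUE when a folder holds more than one type subfolder, ignoring "collated".
needs_collation <- function(path) {
  types <- list.dirs(path, full.names = FALSE, recursive = FALSE)
  length(unique(setdiff(types, "collated"))) > 1
}

# Toy layout for illustration only.
base <- file.path(tempdir(), "collate-check")
unlink(base, recursive = TRUE)
dir.create(file.path(base, "multi", "cases"), recursive = TRUE)
dir.create(file.path(base, "multi", "deaths"), recursive = TRUE)
dir.create(file.path(base, "multi", "collated"), recursive = TRUE)
dir.create(file.path(base, "single", "cases"), recursive = TRUE)
dir.create(file.path(base, "single", "collated"), recursive = TRUE)

# "multi" has cases and deaths, so it qualifies; "single" has only cases.
sapply(file.path(base, c("multi", "single")), needs_collation)
```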

seabbs commented on August 11, 2024

This is now in place in the form of a function but needs to be linked up to the infrastructure.

@kathsherratt in the estimates I see different lengths of time series. Any ideas?

seabbs commented on August 11, 2024

Sounds spot on. Call should be:

collate_estimates(name = "united-kingdom", target = "rt")

If you have a local version of this running, or access to the server, giving this a manual run today and publishing the estimates would be a useful test.

joeHickson commented on August 11, 2024

@seabbs is there a branch for that?

seabbs commented on August 11, 2024

@kathsherratt PR'd and I reviewed it so it's in master (in R/utils.R) but uncalled.

seabbs commented on August 11, 2024

@joeHickson any idea on a timeline for this to start being scheduled or ideas on actions I can take on our end to get this output?

seabbs commented on August 11, 2024

awesome!

seabbs commented on August 11, 2024

sounds like it needs a recursive = TRUE set. One sec. Yes that is the issue I think
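
For reference, a small sketch of what `recursive = TRUE` changes, using base R only. The path mirrors the one in the error message but sits under `tempdir()`, and `write.csv()` stands in for `data.table::fwrite()` so the example has no dependencies; the data frame contents are placeholders.

```r
# Nested target folder whose parents ("collated", etc.) may not exist yet.
target_dir <- file.path(
  tempdir(), "subnational", "united-kingdom", "collated", "rt"
)

# recursive = TRUE creates any missing parent folders as well;
# showWarnings = FALSE keeps reruns quiet when the folder already exists.
dir.create(target_dir, recursive = TRUE, showWarnings = FALSE)

df <- data.frame(region = "example", median = 1)  # placeholder values
out_file <- file.path(target_dir, paste0(Sys.Date(), ".csv"))
write.csv(df, out_file, row.names = FALSE)
```

Without `recursive = TRUE`, `dir.create()` fails when the parent folder is missing, which matches the "cannot create dir" warning reported in this thread.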

joeHickson commented on August 11, 2024

I'll pop that in now

seabbs commented on August 11, 2024

Done here #68 but may be faster to do elsewhere.

joeHickson commented on August 11, 2024

ok it's in and generating, but the output is being ignored because I think it's tripping up the .gitignore date filter
