There seems to be an issue with the updates. Is that on purpose?


No updates for 4 days about covid-19-germany-gae HOT 17 CLOSED

jgehrcke commented on May 13, 2024

No updates for 4 days

from covid-19-germany-gae.

Comments (17)

gerbsen commented on May 13, 2024

I want to have one source for both cases and deaths. Having multiple sources with different dates is confusing. I also need daily updates for the current date, since I want to show other decision makers current numbers. Also, the Risklayer file seems to have duplicate days per line (as shown in your example above). So the easiest thing for me would be to copy the code which generates the CSV from your repo to my dashboard, I guess.


jgehrcke commented on May 13, 2024

Well, I don't think there is an issue (or at least we didn't properly describe it yet). The slight delay is "on purpose" as in "it's fine" as in "it encourages asking the right questions", as you did here. Thanks for asking.

I'd love you to have a look at #93. It makes sense to look at the RKI timeseries data only up to ~3 days ago -- the RKI data for "the last few days" are a little skewed.

To make this a little easier to understand: the RKI data for today should be looked at in about a week, only then it's not really subject to changes anymore :-).

I might look into doing a daily update here, but then people look at today's data without asking good questions.

What do you think?


gerbsen commented on May 13, 2024

IMHO it makes sense to have daily numbers and redraw and annotate graphs (danger: past values can change)



gerbsen commented on May 13, 2024

Could you tell me what I would need to do myself for the daily version?


jgehrcke commented on May 13, 2024

IMHO it makes sense to have daily numbers

We definitely have daily numbers! :-) It's just advisable to choose between Risklayer data and RKI data depending on which time frame you're looking at.

Use the RKI-provided numbers for everything older than 2-3 days, and use the Risklayer data set for today and the last 2-3 days. Here is the tail end of the Risklayer data set as of today (Sep 29):

2020-09-28T01:00:00+0000,4703,7689,20074,2366,68653,18677,10576,49693,67398,3341,14195,4257,1154,7098,2604,4061,286539
2020-09-29T01:00:00+0000,4742,7749,20269,2385,69213,18798,10678,50048,67757,3343,14326,4266,1165,7174,2635,4070,288618
2020-09-29T09:00:00+0000,4742,7749,20269,2385,69213,18798,10678,50048,67757,3343,14326,4268,1165,7174,2635,4070,288620

straight from: https://github.com/jgehrcke/covid-19-germany-gae/blob/master/cases-rl-crowdsource-by-state.csv
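The selection rule described above (RKI for older days, Risklayer for the most recent ones) can be sketched roughly as follows. This is only an illustration and not code from this repository: the function name `combine`, the `recent_days` parameter, and all numeric values are made-up placeholders, and the time series are simplified to plain `date -> cumulative count` dictionaries.

```python
from datetime import date, timedelta


def combine(rki, rl, today, recent_days=3):
    """Take RKI values for days older than `recent_days`, Risklayer values otherwise.

    Hypothetical helper: `rki` and `rl` map a calendar day to a cumulative count.
    """
    cutoff = today - timedelta(days=recent_days)
    # Older days: trust the RKI series, which has settled by then.
    combined = {day: value for day, value in rki.items() if day < cutoff}
    # Recent days: prefer the fresher Risklayer series.
    combined.update({day: value for day, value in rl.items() if day >= cutoff})
    return dict(sorted(combined.items()))


# Placeholder numbers for illustration only, not real case counts.
rki = {date(2020, 9, 24): 100, date(2020, 9, 25): 110,
       date(2020, 9, 26): 120, date(2020, 9, 27): 130}
rl = {date(2020, 9, 26): 125, date(2020, 9, 27): 135,
      date(2020, 9, 28): 145, date(2020, 9, 29): 155}

combined = combine(rki, rl, today=date(2020, 9, 29))
```

With `recent_days=3` and "today" set to Sep 29, the cutoff is Sep 26, so Sep 24 and Sep 25 come from the RKI placeholder series and Sep 26 onward from the Risklayer one. Where exactly to put the cutoff is a judgment call.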

Could you tell me, what I would need to do myself for the daily version?

Which data set (specific CSV file / files) are you interested in generating yourself? The tooling is all in this repository, and we can certainly try to better document how to use it.


jgehrcke commented on May 13, 2024

Can continue discussing, but closing this for now!


gerbsen commented on May 13, 2024

Which data set (specific CSV file / files) are you interested in generating yourself? The tooling is all in this repository, and we can certainly try to better document how to use it.

It seems to me that AG_RKI_SUMS_QUERY_BASE_URL is not defined in the code.


jgehrcke commented on May 13, 2024

I also need daily updates for the current date, since I want to show other decision makers current numbers.

I sense some frustration here :-). Note that this is a free time project!

At the same time your ask(s) is/are a little ambiguous. I'll try to address your points.

the risklayer file seems to have duplicate days per line (as shown in your example above)

"Duplicate days per line" is a funny description... aehm, let me clarify: each line reflects one data point. A data point is comprised of a timestamp, and a set of numeric values.

Now, yes, there might be more than one data point (line) per day in the RL data set. But only ever for the last day in the data set.

This is by design: as I said above, the RL data set is supposed to be recent.

The timestamp of each data point is provided with a time resolution of 1 hour. When you consume time series data then in general it's good advice to maybe not expect equidistant samples :-).
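For plots that expect exactly one value per day, one option is to keep only the latest data point of each calendar day instead of dropping the last line. The following is a minimal sketch using only the Python standard library. The sample rows are shaped like the CSV tail quoted earlier in this thread but truncated to two value columns, and the column names `time_iso8601`, `col_a` and `sum_cases` as well as the helper `last_point_per_day` are assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Shortened sample in the shape of the Risklayer CSV tail quoted in this thread.
# Column names are assumed for illustration.
SAMPLE = """time_iso8601,col_a,sum_cases
2020-09-28T01:00:00+0000,4703,286539
2020-09-29T01:00:00+0000,4742,288618
2020-09-29T09:00:00+0000,4742,288620
"""


def last_point_per_day(csv_text):
    """Return one row per calendar day, keeping the row with the latest timestamp."""
    latest = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        ts = datetime.strptime(row["time_iso8601"], "%Y-%m-%dT%H:%M:%S%z")
        day = ts.date()
        # Replace the stored row only if this one is later on the same day.
        if day not in latest or ts > latest[day][0]:
            latest[day] = (ts, row)
    return [latest[day][1] for day in sorted(latest)]


rows = last_point_per_day(SAMPLE)
```

Here the two Sep 29 samples collapse into the later one (09:00), so the most recent numbers are kept rather than thrown away.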

I want to have one source for both cases and deaths.

RKI and RL data are the same source: Gesundheitsaemter.

Fair enough, but why don't you use the RKI data set for that? I guess you're saying that it does not appear to be 'fresh' enough?

I get that you really want to have the latest RKI data point, and well, yeah. We can do that. :)

So the easiest thing for me would be to copy the code which generates the csv from your repo to my dashboard

You're absolutely welcome to do with the code you find in this repository whatever you'd like, subject to the License declared in the file header(s). And of course -- if you find a meaningful way to run a CPython interpreter with pandas as part of 'your dashboard' then this sounds like a great approach!


jgehrcke commented on May 13, 2024

The RKI data files now contain the most recent data points.


jgehrcke commented on May 13, 2024

This screenshot shows the RL data set on the left, and the RKI data set on the right, and hopefully clarifies once again how these data sets relate to one another.

To make the point that the RL data set is more accurate for the very recent past: here we also see again that it makes sense that, to quote myself from above,

there might be more than one data point (line) per day in the RL data set. But only ever for the last day in the data set.

The screenshot also shows that if you're not particularly interested in today/yesterday, then the RKI data set is more credible (better view into the past).


gerbsen commented on May 13, 2024

I also need daily updates for the current date, since I want to show other decision makers current numbers.

I sense some frustration here :-). Note that this is a free time project!

No no, sorry if this is what you heard. I'm entirely grateful for your project and help. :)

At the same time your ask(s) is/are a little ambiguous. I'll try to address your points.

the risklayer file seems to have duplicate days per line (as shown in your example above)

"Duplicate days per line" is a funny description... aehm, let me clarify: each line reflects one data point. A data point is comprised of a timestamp, and a set of numeric values.

Yeah, I know that the last day has two entries with different times. It just seemed odd to me that this only happens for the last day. This screws up my plots a bit, so I just skip the last line :)

Now, yes, there might be more than one data point (line) per day in the RL data set. But only ever for the last day in the data set.

This is by design: as I said above, the RL data set is supposed to be recent.

The timestamp of each data point is provided with a time resolution of 1 hour. When you consume time series data then in general it's good advice to maybe not expect equidistant samples :-).

I want to have one source for both cases and deaths.

RKI and RL data are the same source: Gesundheitsaemter.

Fair enough, but why don't you use the RKI data set for that? I guess you're saying that it does not appear to be 'fresh' enough?

I get that you really want to have the latest RKI data point, and well, yeah. We can do that. :)

So the easiest thing for me would be to copy the code which generates the csv from your repo to my dashboard

You're absolutely welcome to do with the code you find in this repository whatever you'd like, subject to the License declared in the file header(s). And of course -- if you find a meaningful way to run a CPython interpreter with pandas as part of 'your dashboard' then this sounds like a great approach!

Any chance you could publish the environment URLs? Also do you have a specific time when you update your files?


jgehrcke commented on May 13, 2024

Any chance you could publish the environment URLs?

please have a look at #208 :)


gerbsen commented on May 13, 2024

Any chance you could publish the environment URLs?

please have a look at #208 :)

Thank you! Could you also publish the Risklayer URL?


jgehrcke commented on May 13, 2024

Thank you! Could you also publish the Risklayer URL?

I use the official Risklayer GmbH Google sheet, based on which I constructed a CSV export URL (using the magic ingredient pub?output=csv):

export RISKLAYER_HISTORY_CSV_URL="https://docs.google.com/spreadsheets/d/e/2PACX-1vTiKkV3Iy-BsShsK3DSUeO9Gpen7VwsXM_haCOc8avj1PeoCIWqL4Os-Uza3jWMEUgmTrEizEV-Itq5/pub?output=csv"

no guarantees implied :).
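If you want to consume that environment variable from your own Python code, a rough sketch could look like this. It is not taken from this repository: the helper names `risklayer_csv_url` and `fetch_risklayer_csv` are hypothetical, and the check for `output=csv` is just a sanity check I'm assuming is useful.

```python
import os
import urllib.request


def risklayer_csv_url():
    """Read the CSV export URL from the environment and sanity-check it."""
    url = os.environ.get("RISKLAYER_HISTORY_CSV_URL")
    if not url:
        raise RuntimeError("RISKLAYER_HISTORY_CSV_URL is not set")
    if "output=csv" not in url:
        # Without this query parameter the sheet is served as HTML, not CSV.
        raise ValueError("URL does not look like a CSV export (missing output=csv)")
    return url


def fetch_risklayer_csv(timeout_seconds=30):
    """Download the CSV text. Needs network access; call it explicitly."""
    with urllib.request.urlopen(risklayer_csv_url(), timeout=timeout_seconds) as resp:
        return resp.read().decode("utf-8")
```

Failing early on a missing variable gives a clearer error than a name that silently resolves to nothing.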


jgehrcke commented on May 13, 2024

Quick update: we're very close to having multiple updates per day in this repository now; all done automatically.


jgehrcke commented on May 13, 2024

To make this a little easier to understand: the RKI data for today should be looked at in about a week, only then it's not really subject to changes anymore :-).

This is especially true for the cumulative count of COVID-19-attributed deaths. I have tried to explain some of that in this Twitter thread: https://twitter.com/gehrcke/status/1343602019651760129


jgehrcke commented on May 13, 2024

Also do you have a specific time when you update your files?

There are now multiple updates per day done automatically @gerbsen.
