AU_COVID19

Time series of confirmed COVID-19 cases for Australian states, originally from and still cross-checked against covid19data.com.au, compiled primarily by Juliette O'Brien. When case numbers are reported at differing times of day, there may be differences between my data and that site. I am trying to use my judgement to make the time series as consistent as possible, but the data is inherently messy and you shouldn't necessarily trust every daily percentage change for every state.
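To see why individual daily percentage changes can mislead, here is a minimal sketch that computes them from a list of cumulative counts. The function name and the sample numbers are hypothetical and not taken from this repository's files; it assumes you have already read one state's cumulative totals into a plain list.

```python
def daily_pct_change(cumulative):
    """Return day-on-day percentage changes for a list of cumulative counts.

    An entry is None where the previous day's total is zero, because the
    percentage change is undefined there.
    """
    changes = []
    for prev, curr in zip(cumulative, cumulative[1:]):
        if prev == 0:
            changes.append(None)
        else:
            changes.append(100.0 * (curr - prev) / prev)
    return changes


# Hypothetical cumulative totals for one state (not real data).
example = [0, 4, 5, 10, 10]
print(daily_pct_change(example))  # [None, 25.0, 100.0, 0.0]
```

A jump like the 100% step above could simply reflect a change in reporting time rather than a real doubling, so it may be safer to look at changes over several days.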

Notes on NSW

  • On 3 July, NSW reported 189 historical cases from the Ruby Princess to the federal government, which now appear in NSW's totals published by the federal Department of Health, but not by NSW Health (the cases – all crew members – were diagnosed and managed on the ship). My numbers follow the NSW Health website, and so exclude these cases.

  • The NSW sources of infection are from Data.NSW's CSV file, which now contains the full case dataset, up to several days ago ("Publication of some data in this dataset is being delayed because the risk of gaining information about an individual in the dataset increases as the number of cases decreases"). The following bullet points are no longer relevant unless you're trying to piece together the history of time_series_nsw_sources.csv. Update 4 June: On 3 June, the dates in the CSV file for all or most cases were changed. I believe the date was previously the date the test sample was taken; I don't know what it is now, but it might be the date of test analysis or something related.

  • The NSW sources figures prior to 9 March are extracted from the epidemiological curve graph at the NSW Health statistics page. My extraction code isn't necessarily precise (the axis ticks may be two pixels tall, for example), and there can be small discrepancies between my totals for each source of infection and the totals reported by NSW Health. On 3 April, the published graph stopped distinguishing between 'locally acquired from unknown source' and 'locally acquired from a known contact/cluster', and I have used the graph published on 2 April for these early dates, which will probably cause minor inconsistencies as this early data is still occasionally revised. I have arbitrarily placed two cases (1 March and 7 March) in the amalgamated 'local' category into 'local unknown'.

  • Numbers since 9 March are counted from the CSV file at Data.NSW. There are much larger discrepancies between the number of cases as shown in the epidemiological curve and the number of confirmed cases in time_series_cases.csv, I think because the date reported in this file (and graph) is the date that the sample was taken, and there can be quite a lag between the sample being taken and it being analysed to a positive result. Expect the numbers for the last few days to be substantially revised upwards as the backlog of samples is tested; data from earlier days is also subject to revision.

  • On 11 April, many cases previously classified as 'Locally acquired - contact not identified' were reclassified into the new category 'Overseas or interstate'. On 14 April, a separate 'Interstate' category was introduced into the source CSV; this standalone category doesn't exist in the graphs, so my data does not show any interstate infections prior to 9 March, even though some may be recorded as such internally at NSW Health.

  • On 21 March, NSW changed from reporting case numbers as of 11am to case numbers as of 8pm the previous evening.

Thanks to @tetrakazi for her scraper of the Victorian data, which I have adapted for the sources of infections for that state. I don't know why our time series don't agree.

Prior to 21 April, a bug in my parsing scripts meant that there were occasional errors in time_series_vic_sources.csv and time_series_act_sources.csv. From 21 April, the total numbers of reported cases should tally correctly with time_series_cases.csv.
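A tally check of this kind could be sketched as follows. This is an illustration only: it assumes the sources file and the cases file have already been loaded into dictionaries keyed by date, and the function name, the source category names, and the sample values are all hypothetical. The actual column layout of the CSV files is not described here.

```python
def find_tally_mismatches(source_rows, case_totals):
    """Compare per-source counts against reported totals, date by date.

    source_rows: {date: {source_name: cumulative_count}}
    case_totals: {date: cumulative_total}
    Returns a list of (date, sum_of_sources, reported_total) for each date
    present in both inputs where the two numbers differ.
    """
    mismatches = []
    for date in sorted(source_rows):
        if date not in case_totals:
            continue  # nothing to compare against on this date
        source_sum = sum(source_rows[date].values())
        if source_sum != case_totals[date]:
            mismatches.append((date, source_sum, case_totals[date]))
    return mismatches


# Hypothetical values with placeholder dates (not real data).
sources = {
    "2000-01-01": {"overseas": 3, "local": 2},
    "2000-01-02": {"overseas": 4, "local": 2},
}
totals = {"2000-01-01": 5, "2000-01-02": 7}
print(find_tally_mismatches(sources, totals))  # [('2000-01-02', 6, 7)]
```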

time_series_wa_sources.csv presents cumulative totals with the date supposedly being "optimal date of onset", but sometimes the numbers go down, which I don't understand. Some cases recorded as local contact are, according to media releases, of people in hotel quarantine.
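One way to locate the dates where a cumulative series falls is sketched below. The function name and sample values are hypothetical; it assumes the dates and one column of cumulative totals have been read into two lists of equal length, in date order.

```python
def find_decreases(dates, cumulative):
    """Return (date, previous_value, value) wherever a cumulative series drops."""
    drops = []
    for i in range(1, len(cumulative)):
        if cumulative[i] < cumulative[i - 1]:
            drops.append((dates[i], cumulative[i - 1], cumulative[i]))
    return drops


# Hypothetical series with placeholder labels (not real data).
dates = ["day 1", "day 2", "day 3", "day 4"]
counts = [10, 12, 11, 15]
print(find_decreases(dates, counts))  # [('day 3', 12, 11)]
```

This only reports where the drops are. It cannot say whether a drop comes from a reclassification, a removed duplicate, or something else.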

WA counted historical cases identified through serology testing in its case count until 1 August. My numbers follow the WA dashboard, so 26 of these historical cases were removed from the tally on that date.

As of 2021-05-21, the ACT dashboard has five cases without a source of infection; based on the federal dashboard numbers, these appear to be interstate-acquired, and are now classified in time_series_act_sources.csv as such.

In time_series_tests.csv:

  • WA's figures are persons tested until 30 April; from 1 May they are tests performed.
  • NSW's figures are persons tested until 25 May; from 26 May they are tests performed.
  • Other states are (I believe) all tests performed.
  • On 6 June, Victoria's number fell by about 12,000 after removing duplicated data.
  • Qld added about 38,500 tests from a private provider on 22 June.
  • SA added tests from a private provider on 29 July.

As of 23 April, relevant state health department links:

NSW: COVID-19 page, Source of infections CSV

Vic: Dashboard

Qld: Statistics

SA: Dashboard

WA: Dashboard

Tas: Statistics; my daily updates used to follow the (discontinued as of 12 June) evening case announcements, usually tweeted by Monte Bovill (ABC), Emily Jarvie (Advocate/Examiner), and others.

ACT: Dashboard

NT: COVID-19 page

The federal government's statistics page has some testing statistics not always released by the state health departments.

Contributors

pappubahry
