
texas-covid's Introduction

Summary

Daily publication of cleaned and tidy Texas county-level Covid-19 statistics, as published by Texas DSHS.
Original data is sourced from https://www.dshs.state.tx.us/coronavirus/additionaldata/. Be aware that the source files are poorly formatted Excel workbooks.

Tidy data can be accessed here:

Data has been cleaned and put in a long format for easy visualization and modeling.

All data-tables have the following fields:

  1. "County": Texas county name
  2. "Date": Date associated with observation, YYYY-MM-DD format.
  3. "DailyCount": Cumulative measure to date, as published by DSHS.
  4. "DailyDelta": Calculated daily change ($x_{t} - x_{t-1}$), e.g. the number of new cases on a given day.
  5. "LastUpdateDate": Date when the data was pulled.
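
The relationship between "DailyCount" and "DailyDelta" can be illustrated with a small sketch. It uses made-up numbers for two hypothetical counties (not DSHS data) and assumes the delta is a per-county first difference computed with `dplyr::lag()`; the repository's actual cleaning script may do this differently.

```r
library(dplyr)

# Toy cumulative counts for two hypothetical counties (illustrative values only)
toy <- tibble(
  County = rep(c("A", "B"), each = 3),
  Date = rep(as.Date("2020-05-01") + 0:2, times = 2),
  DailyCount = c(10, 12, 15, 100, 100, 130)
)

# DailyDelta = x_t - x_{t-1}, computed within each county in date order.
# The first observation of each county has no predecessor, so its delta is NA.
toy_delta <- toy %>%
  arrange(County, Date) %>%
  group_by(County) %>%
  mutate(DailyDelta = DailyCount - lag(DailyCount)) %>%
  ungroup()
```

Grouping by county before differencing matters: without it, the first row of one county would be differenced against the last row of the previous one.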

DSHS updates its data every day around 9:30am CST; the tidy data is then updated at 10:30am CST.

Getting Data

Read the data from the GitHub link.

library(tidyverse) # read_csv(), dplyr verbs, ggplot2
library(knitr)     # kable()

dat <- read_csv(file = "https://raw.githubusercontent.com/nikolkj/Texas-Covid/master/daily-county-data/Texas-County-Cases.csv", col_names = TRUE, progress = FALSE)
## Parsed with column specification:
## cols(
##   County = col_character(),
##   Date = col_date(format = ""),
##   DailyCount = col_double(),
##   DailyDelta = col_double(),
##   LastUpdateDate = col_date(format = "")
## )
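
The parsing message above appears because readr guesses the column types. A minimal sketch of declaring them explicitly follows. To stay self-contained it reads a small temporary file (the name `tmp` and the object `sample_dat` are illustrative) holding two rows taken from the sample table in this README rather than the live URL; the same `col_types` argument can be passed when reading the GitHub link.

```r
library(readr)

# Write a tiny stand-in file with the documented columns
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "County,Date,DailyCount,DailyDelta,LastUpdateDate",
  "Harris,2020-06-02,12664,388,2020-06-10",
  "Hays,2020-06-07,385,0,2020-06-10"
), tmp)

# Declare each column's type up front so readr does not have to guess
spec <- cols(
  County = col_character(),
  Date = col_date(format = ""),
  DailyCount = col_double(),
  DailyDelta = col_double(),
  LastUpdateDate = col_date(format = "")
)

sample_dat <- read_csv(file = tmp, col_types = spec)
```

Supplying the specification also makes the read fail loudly if the published file ever changes its column layout.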

Examine a random sample of the data.

dat %>% 
  filter(Date > "2020-04-15", DailyCount > 100) %>%
  sample_n(15) %>% 
  kable() %>% kableExtra::kable_styling(kable_input = ., bootstrap_options = c("striped", "hover"))
County      Date        DailyCount  DailyDelta  LastUpdateDate
Collin      2020-05-21  1090        17          2020-06-10
Kaufman     2020-05-10  116         0           2020-06-10
Grayson     2020-06-03  350         8           2020-06-10
Hidalgo     2020-05-07  359         6           2020-06-10
Montgomery  2020-06-07  1064        0           2020-06-10
Hardin      2020-05-23  136         11          2020-06-10
Potter      2020-05-21  2196        3           2020-06-10
Bowie       2020-06-05  301         5           2020-06-10
Randall     2020-05-17  602         9           2020-06-10
Bell        2020-05-15  242         5           2020-06-10
Taylor      2020-05-02  327         8           2020-06-10
Harris      2020-06-02  12664       388         2020-06-10
Hays        2020-06-07  385         0           2020-06-10
Hardin      2020-05-30  138         0           2020-06-10
Coryell     2020-05-15  221         1           2020-06-10

Reporting Data

Find the date on which new cases peaked in each county, then take the top 10 counties by peak size.

# Keep each county's row(s) with the largest daily increase, then rank counties
dat %>% group_by(County) %>%
  filter(DailyDelta == max(DailyDelta, na.rm = TRUE)) %>%
  ungroup() %>%
  rename(PeakDate = Date, PeakCases = DailyDelta) %>%
  arrange(desc(PeakCases)) %>% head(n = 10) %>% 
  select(County, PeakDate, PeakCases) %>%
  kable() %>% kableExtra::kable_styling(kable_input = ., bootstrap_options = c("striped", "hover"), full_width = FALSE, position = "left")
County   PeakDate    PeakCases
Harris   2020-04-10  706
Potter   2020-05-16  618
Walker   2020-05-31  510
Tarrant  2020-05-11  485
Dallas   2020-05-22  369
Jones    2020-05-28  222
El Paso  2020-06-04  197
Bexar    2020-05-31  189
Moore    2020-06-02  149
Medina   2020-06-06  138
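
Filtering on `DailyDelta == max(DailyDelta)` keeps every date that ties for a county's maximum, so a county can appear more than once. One possible alternative, sketched here on made-up data with hypothetical county names, sorts each county by largest delta and then earliest date and keeps only the first row. This is an illustration, not the approach used to produce the table above.

```r
library(dplyr)

# Toy data: county "A" reaches its maximum daily delta (5) on two dates
toy <- tibble(
  County = c("A", "A", "A", "B", "B"),
  Date = as.Date(c("2020-05-01", "2020-05-02", "2020-05-03",
                   "2020-05-01", "2020-05-02")),
  DailyDelta = c(NA, 5, 5, 2, 7)
)

# Within each county, sort by largest delta, then earliest date, and keep the
# first row. arrange() places NA values last, so they are never selected
# unless a county has no non-missing deltas at all.
peaks <- toy %>%
  group_by(County) %>%
  arrange(desc(DailyDelta), Date, .by_group = TRUE) %>%
  slice(1) %>%
  ungroup() %>%
  rename(PeakDate = Date, PeakCases = DailyDelta)
```

The result has exactly one row per county, with the earliest of any tied peak dates.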

Plotting Data

dat %>%
  filter(!is.na(DailyDelta), 
         County %in% c("Harris","Dallas","Bexar","Walker")) %>%
  mutate(County = factor(County)) %>%
  select(County, Date, DailyDelta) %>% 
  ggplot(data = ., mapping = aes(x = Date, y = DailyDelta, col = County)) +
  geom_line() + 
  ggtitle("New Cases", subtitle = "For select counties") +
  ylab("") + xlab("") +
  scale_x_date(labels = scales::date_format(format = "%m/%d")) + 
  ggthemes::theme_fivethirtyeight()

dat %>% 
  filter(County %in% c("Harris","Dallas","Bexar","Walker"),
         DailyCount > 0,
         Date > "2020-03-15") %>%
  mutate(County = factor(County)) %>%
  select(County, Date, DailyCount) %>% 
  ggplot(data = ., mapping = aes(x = Date, y = DailyCount, col = County)) +
  geom_line() + 
  ggtitle("Total Cases", subtitle = "For select counties") +
  ylab("") + xlab("") +
  scale_y_continuous(na.value = 0, trans = "log10", labels = scales::number_format(big.mark = ",", accuracy = 1)) +
  scale_x_date(labels = scales::date_format(format = "%m/%d")) + 
  ggthemes::theme_fivethirtyeight()

      
