cdc_overdose's Introduction

Code for "COVID-19 and the Drug Overdose Crisis: Uncovering the Deadliest Months in the United States, January-July 2020"

Paper:

Friedman, J. & Akre, S. COVID-19 and the Drug Overdose Crisis: Uncovering the Deadliest Months in the United States, January‒July 2020. Am J Public Health e1–e8 (2021). doi:10.2105/ajph.2021.306256

Updated 07-17-2021

  • Overdose data by state for analysis purposes in: Useable_Overdose_Data_Through_Oct_2020.csv

Monthly Overdose Deaths: United States

[Figure: monthly overdose deaths, United States]

Monthly Overdose Deaths: Select U.S. States

[Figure: monthly overdose deaths, state level]

Directory Layout

  • src/: Contains Python scripts used in the analysis
    • defilter_data.py: Disaggregates rolling sum data
    • estimate_error.py: Calculates error in disaggregation process from historical data
    • load_data.py: Cleans and preprocesses input datasets
    • tests/: Unit tests to ensure desired functions work as intended
  • input/: Input datafiles used for analysis
  • output/: Generated intermediate files
  • visuals/: Generated figures and tables
    • state_timeseries_2021-05-16.pdf: State by state data
  • CDC_overdose_monthly_recovery.ipynb: Python notebook used to disaggregate CDC overdose death data and calculate errors in that process
  • CDC_monthly_imputation.R: Imputes missing data from monthly "ground truth" overdose data
  • CDC_monthly_recovery_analysis.R: Runs the analysis and visualizations of monthly overdose data. Used to generate the figures in the manuscript.
  • Useable_Overdose_Data_Through_Sep_2020.csv: Cleaned monthly overdose death data by state. Intended for use by others.
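
To illustrate what "disaggregating rolling sum data" can mean, here is a minimal sketch, not the code in src/defilter_data.py. It assumes the published figures are trailing 12-month sums and that the first 12 monthly values are known from another source. The function names `rolling_sum` and `disaggregate` and the `seed` parameter are hypothetical. The sketch relies on the identity S_t - S_{t-1} = m_t - m_{t-12}.

```python
from typing import List, Sequence


def rolling_sum(monthly: Sequence[float], window: int = 12) -> List[float]:
    """Trailing rolling sums; the first value covers monthly[0:window]."""
    return [
        sum(monthly[i - window + 1 : i + 1])
        for i in range(window - 1, len(monthly))
    ]


def disaggregate(
    rolling: Sequence[float], seed: Sequence[float], window: int = 12
) -> List[float]:
    """Recover monthly values from trailing rolling sums.

    `seed` holds the first `window` monthly values (assumed known);
    their total must match rolling[0].
    """
    if not rolling:
        raise ValueError("rolling must not be empty")
    if len(seed) != window:
        raise ValueError("seed must contain exactly `window` values")
    if abs(sum(seed) - rolling[0]) > 1e-9:
        raise ValueError("seed total does not match the first rolling sum")

    monthly = list(seed)
    for t in range(1, len(rolling)):
        # The month that just left the window is `window` steps back.
        dropped = monthly[len(monthly) - window]
        # S_t - S_{t-1} = m_new - m_dropped, so solve for m_new.
        monthly.append(rolling[t] - rolling[t - 1] + dropped)
    return monthly


if __name__ == "__main__":
    # Synthetic counts for illustration only; not real overdose data.
    truth = [100 + 3 * i for i in range(18)]
    recovered = disaggregate(rolling_sum(truth), truth[:12])
    print(recovered == truth)
```

Any error in the seed values propagates forward with a 12-month period under this scheme, which is one reason an error estimate from historical data (as estimate_error.py is described as providing) is useful.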

Programming Environment

Python 3.7.6

  • pandas==1.2.0
  • numpy==1.19.4
  • scipy==1.5.4
  • matplotlib==3.3.3
  • seaborn==0.11.1
  • jupyterlab==3.0.0

R 4.0.3

Steps For Analysis

  1. Set up R and python environments
  2. Create two empty folders, output and visuals, inside the repository
    • Intermediate files and manuscript figures will be placed here
  3. Run CDC_monthly_imputation.R
    • Make sure to change the root variable to point to the location of this repository on your local system
  4. Run the full CDC_overdose_monthly_recovery.ipynb python notebook
  5. Run CDC_monthly_recovery_analysis.R
    • Make sure to change the root variable to point to the location of this repository on your local system
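
Step 2 above can be scripted. The following is a minimal sketch that creates the two folders if they are missing; the helper name `prepare_repo` is hypothetical and is not part of this repository.

```python
from pathlib import Path
from typing import List, Union


def prepare_repo(root: Union[str, Path]) -> List[Path]:
    """Create the output/ and visuals/ folders under `root` if missing."""
    root = Path(root)
    created = []
    for name in ("output", "visuals"):
        folder = root / name
        # exist_ok=True makes the call safe to repeat.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for folder in prepare_repo("."):
        print(folder)
```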

cdc_overdose's People

Contributors

  • akre96
  • privaterra

cdc_overdose's Issues

Revisions to get the code to work in RStudio

I have finally been able to get the code to run in RStudio (more or less) in a Docker instance. Here are a couple of changes I had to make to get it to work:

  1. Regarding the Python notebook, I couldn’t get it to work, so I converted CDC_overdose_monthly_recovery.ipynb to a Python script using the command below:

jupyter nbconvert CDC_overdose_monthly_recovery.ipynb --to script
Then I opened a terminal window in RStudio and ran it with: python3 CDC_overdose_monthly_recovery.py

  2. “CDC_monthly_recovery_analysis.R” doesn’t exist for me, so I reviewed and revised “CDC_monthly_recovery.R” instead.

Hardcoded directories are still in the code, so it won’t run “out of the box”… Make the following changes to get things working:

  • Lines 14 and 16: change /data/CDC_2020 to /input/
  • Line 45: change /data/CDC_2020 to /output/
  • Line 48: change /data/CDC_2020 to /output/
  • Line 61: change /data/CDC_2020 to /input/
  • Line 92: change /data/CDC_2020 to /input/
  • Line 104: change /data/CDC_2020 to /output/
  • Line 458: change /data/CDC_2020/CDC_ts_July2020.csv to /input/CDC_ts.csv
  3. The gpclib package needs to be installed:

install.packages("gpclib", type="source")
(as per https://stackoverflow.com/questions/30790036/error-istruegpclibpermitstatus-is-not-true)

  4. I updated the root path to /home/rstudio/cdc_overdose

The code was then able to run and generate the output files & visuals
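
The per-line path changes listed in this issue could also be applied with a small script rather than by hand. This is only a sketch: the helper `rewrite_paths` is hypothetical, and the example runs on a made-up three-line stand-in, since the actual contents of the R script are not shown here. It assumes each targeted line contains the old path followed by a slash, and it refuses to touch a line that does not.

```python
from typing import Dict, List, Tuple


def rewrite_paths(
    lines: List[str], edits: Dict[int, Tuple[str, str]]
) -> List[str]:
    """Apply per-line substring replacements.

    `edits` maps a 1-based line number to an (old, new) pair.
    """
    result = list(lines)
    for number, (old, new) in edits.items():
        index = number - 1
        if old not in result[index]:
            # Fail loudly if the script no longer matches the line numbers.
            raise ValueError(f"line {number} does not contain {old!r}")
        result[index] = result[index].replace(old, new)
    return result


if __name__ == "__main__":
    # Toy stand-in for an R script; file names are invented for illustration.
    script = [
        'root <- "/home/rstudio/cdc_overdose"',
        'raw <- read.csv(paste0(root, "/data/CDC_2020/example_in.csv"))',
        'write.csv(raw, paste0(root, "/data/CDC_2020/example_out.csv"))',
    ]
    edits = {
        2: ("/data/CDC_2020/", "/input/"),
        3: ("/data/CDC_2020/", "/output/"),
    }
    for line in rewrite_paths(script, edits):
        print(line)
```

Keying the edits by line number mirrors the list in this issue, but it is brittle: if the script changes, the numbers shift, which is why the sketch checks each line before replacing.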

Hard-coded directories: convert to relative?

I'm trying to run the great R code you have developed to generate the graphs in the report. However, I'm running into an issue: there are several hard-coded directory paths in CDC_monthly_recovery_analysis.R:

/data/CDC_2020/

If that works, I want to add Oklahoma to the list of graphs, since an important member of Congress is from that state...
