covid_case_study_with_learnr's Introduction

Introduction to R for Applied Epidemiology

This website hosts training materials for “Introduction to R for Applied Epidemiology”. This course teaches the fundamentals of R for applied epidemiologists and public health practitioners.

Applied Epi is a nonprofit organization supporting frontline practitioners through open-source analytical tools, training, and support. Our Epidemiologist R Handbook is a free R reference manual that has been used by 150,000 people around the world.

Initial setup

Follow these step-by-step instructions to download the course files, set up an RStudio project, and download and begin the course’s interactive exercises.

  1. Download course files

Click here to download a zipped folder to use in the course exercises.

Unzip the folder and save it on your computer’s desktop (not on a shared drive). To “unzip” a folder once it is downloaded, right-click on the folder and select “Extract All”. If offered a choice of location to save the unzipped folder, save it to your desktop.

If you are unable to download, try this link, or ask your course organizer for an emailed version.

  2. Create a new RStudio project

     1. Open RStudio. Ensure that you open RStudio and not just R.

     2. In RStudio click File -> New Project. In the pop-up window, select “Existing directory”.

     3. Click “Browse” and select the “intro_course” folder that you downloaded earlier and saved to your desktop. This folder contains the course materials.

     4. Click “Create project”.

Voila! This will be the project for ALL of your work in this course.

  3. Access Applied Epi course exercises

Ask your instructor for the link to the online interactive exercises. If you have unreliable internet, ask your instructor for assistance.

  4. Begin the first exercise

The course exercises will now appear within RStudio. Each course module has a corresponding exercise, which can be accessed through the “Tutorial” tab in the upper-right RStudio pane. The gif below introduces you to the exercise environment (you do not need to follow the steps shown right now).

  1. Click on the “Tutorial” tab in the upper-right RStudio pane (which also contains a tab holding your “Environment”).
  • Scroll down and review the listed exercises. If you do not see any “Applied Epi” exercises listed, close and re-open RStudio. They may take a minute to appear.
  2. Select the exercise “Applied Epi - R setup, syntax, data import”
  • The exercise will load. Once you see the Applied Epi logo appear in the Tutorials pane, you can begin the exercise.
  • To see the sidebar in the exercise, you may need to adjust the Tutorials pane to be wider. You can also adjust the zoom from the “View” menu.
  • You can view the exercise in this pane, or click the small icon in the upper-left to pop-out into a separate window.

Modules

Module 1: Introduction to R

We welcome you to the course and dive into the basics of how to interact with R and RStudio, basic R syntax, and how to organize your analytical projects using public health examples. We then cover R functions and packages, and introduce the core functions used to import data. Using these, we import the Ebola case study surveillance linelist, and begin to inspect and review it.

Module 2: Data cleaning

Now that we have our surveillance linelist in R, we cover what “data cleaning” steps are necessary and how to execute these in R. Along the way, we introduce many of the core R functions including adjusting column names, deduplicating and filtering rows, selecting and modifying columns, recoding values, and more. Together, we write a sequence of “pipes” to clean the linelist step-by-step in a clear, reproducible manner… so that our dataset is ready for preliminary analysis!
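As a rough illustration of what such a cleaning pipeline can look like, here is a minimal sketch that assumes the {dplyr} package. The dataset, its column names, and its values are invented for this example and are not the course's Ebola linelist.

```r
library(dplyr)

# Hypothetical raw linelist (invented for illustration only)
linelist_raw <- tibble(
  `Case ID` = c("A1", "A2", "A2", "A3"),
  Age       = c(34, 5, 5, 61),
  Sex       = c("m", "f", "f", "m")
)

# Each step is passed to the next with the pipe operator
linelist <- linelist_raw %>%
  rename(case_id = `Case ID`, age = Age, sex = Sex) %>%    # standardize column names
  distinct() %>%                                            # drop exact duplicate rows
  filter(!is.na(case_id)) %>%                               # keep rows that have a case ID
  mutate(sex = recode(sex, m = "male", f = "female")) %>%   # recode values
  select(case_id, age, sex)                                 # keep and order columns

linelist
```

Because every step sits on its own line, the pipeline reads top to bottom and can be re-run unchanged whenever the raw data is updated.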

Module 3: Grouping data and making summary tables

Informative tables are the bedrock of epidemiological and public health practice. In this module we introduce three tools to produce tables of summary statistics: {dplyr} for flexibility, {janitor} for speed, and {gtsummary} for beauty. Finally, we explore {flextable}, which can be used to beautify any of the above approaches, add colors and highlights, and save tables to Word, PNG, HTML, etc.
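To give a flavor of the {dplyr} approach, the sketch below groups a small linelist by district and computes a few summary statistics. The data and column names are hypothetical, and the {janitor}, {gtsummary}, and {flextable} approaches are not shown here.

```r
library(dplyr)

# Hypothetical linelist (invented for illustration only)
linelist <- tibble(
  district = c("North", "North", "South", "South", "South"),
  age      = c(34, 5, 61, 22, 48),
  outcome  = c("Recover", "Death", "Recover", "Recover", "Death")
)

# One row per district, with counts and summary statistics
summary_table <- linelist %>%
  group_by(district) %>%
  summarise(
    n_cases    = n(),                      # number of rows in the group
    median_age = median(age),              # median age in the group
    n_deaths   = sum(outcome == "Death"),  # count of deaths in the group
    .groups    = "drop"
  ) %>%
  mutate(pct_deaths = round(100 * n_deaths / n_cases, 1))

summary_table
```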

Module 4: Data visualization with {ggplot2}

Using the {ggplot2} package to maximum effect rests upon understanding how to apply its “grammar of graphics” to build a plot layer-by-layer. We tackle this by introducing the grammar piece-by-piece so that you build upon previous knowledge to construct informative and colorful bar plots, scatter plots, histograms, line plots, text plot labels that automatically refresh with updated data (very useful for epidemiological reports!), and more.
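The layer-by-layer idea can be sketched as follows, assuming {ggplot2} is installed. The data frame and its columns are made up for illustration. The caption is built from the data, so it changes whenever the data changes.

```r
library(ggplot2)

# Hypothetical case data (invented for illustration only)
cases <- data.frame(
  age      = c(4, 19, 33, 47, 62),
  days_ill = c(3, 6, 5, 9, 12),
  outcome  = c("Recover", "Recover", "Death", "Recover", "Death")
)

# A label computed from the data, so it refreshes when the data is updated
caption_text <- paste0("n = ", nrow(cases), " cases")

p <- ggplot(data = cases,
            mapping = aes(x = age, y = days_ill, color = outcome)) +  # data and mappings
  geom_point() +                                                      # geometry layer
  labs(title = "Days ill by age",                                     # labels layer
       x = "Age (years)",
       y = "Days ill",
       caption = caption_text) +
  theme_minimal()                                                     # theme layer

p
```

Each `+` adds one piece of the grammar, so a plot can be built up and checked one layer at a time.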

Module 5: Transforming data

Public health analytics rarely involves just one data set, so now we practice joining data by adding hospital, laboratory, and case investigation data to our surveillance linelist. We ingrain best practices for conducting joins, and prepare you for doing data transformations independently. In the second part of this module, we address pivoting, which in R means transforming data between “long” and “wide” formats. This is particularly relevant in public health, where each format has distinct benefits.
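A join and a pivot can be sketched as below, assuming the {dplyr} and {tidyr} packages. All datasets, column names, and values are hypothetical and stand in for the course's hospital and surveillance data.

```r
library(dplyr)
library(tidyr)

# Hypothetical linelist and hospital lookup table (invented for illustration)
linelist  <- tibble(case_id   = c("A1", "A2", "A3"),
                    hosp_code = c("H1", "H2", "H1"))
hospitals <- tibble(hosp_code = c("H1", "H2"),
                    hospital  = c("Central", "Port"))

# Left join: keep every linelist row and add matching hospital details
joined <- left_join(linelist, hospitals, by = "hosp_code")

# Hypothetical weekly counts in "long" format: one row per hospital-week
counts_long <- tibble(
  hospital = c("Central", "Central", "Port", "Port"),
  week     = c(1, 2, 1, 2),
  cases    = c(4, 7, 2, 3)
)

# Long to wide: one row per hospital, one column per week
counts_wide <- pivot_wider(counts_long,
                           names_from   = week,
                           values_from  = cases,
                           names_prefix = "week_")

# Wide back to long (the week column comes back as text)
counts_back <- pivot_longer(counts_wide,
                            cols         = starts_with("week_"),
                            names_to     = "week",
                            values_to    = "cases",
                            names_prefix = "week_")
```

Checking the row count before and after a join (here, three rows in both `linelist` and `joined`) is a simple way to catch unintended duplication.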

Module 6: More data visualization with {ggplot2}

In this second data visualization module we encourage you to practice learning R independently (a necessary skill once you leave the class!) but with our support. We tackle visualizations that are central to descriptive epidemiology: the intricacies of crafting an accurate epidemic curve, conveying patterns in three variables using a heat plot, and creating age/sex pyramids to describe demographics. If there is time, we finish with a demonstration of R’s GIS/geospatial capabilities.

  • Exercise:

Module 7: Routine reports with R Markdown

In this module, we take the R code on the Ebola case study that you have been building throughout the course and convert it into a reproducible, automated report (Word, PDF, HTML, etc.). We teach you the variations in syntax and opportunities that lie in being able to produce documents that update when incoming data is refreshed, that look professional, and can be sent to inform public health partners and stakeholders.

Module 8: Final exercise and code review

In this last module, you put your skills to the test by producing an R Markdown report using a COVID-19 case linelist. Unlike with the Ebola case study, you will not have the answer code available to you. When you finish, we perform “code reviews”, simultaneously improving your coding skills and teaching you how to review others’ code. Before closing, we touch upon how to find your particular community of R users and the resources available to you for questions, and we close with a feedback survey.

  • Slides: COVID case study
  • Exercise materials: See the folder “learning_materials/covid_case_study” for the Word document report to replicate, the data, and a tip sheet.

Sustained support

Our instructors know public health. One of the signature features of Applied Epi’s training is that we provide follow-up support to your team, to help you apply your new skills to your work context.

We schedule five 1.5-hour sessions with your team in the 3 months post-training. In these sessions, we help you troubleshoot code, advise you on analytical strategies, or guide you in new learning that you need.

Notes

  • Please note that all of our case study training materials use fake example data in which no person is identifiable and the actual values have been scrambled.
  • Modifications are possible so that the course uses data from your jurisdiction. Email us at [email protected] to discuss.

Acknowledgements

Authors and contributors to this course curriculum from Applied Epi include:

  • Neale Batra

  • Arran Hamlet

  • Mathilde Mousset

  • Alex Spina

  • Paula Blomquist

  • Amy Mikhail

  • The Fulton County Board of Health graciously provided example data (anonymized and scrambled) for a case study.

  • The {outbreaks} package formed the basis for the fake dataset in the Ebola case study.

Terms of Use and License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Please email [email protected] if you would like to use these materials for an academic course or epidemiologist training program.

Contribution

If you would like to make a content contribution, please contact us first via GitHub issues or by email. We are implementing a schedule for updates and are creating a contributor guide.

Please note that the Epi R Handbook project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Contributors

nsbatra, lbaertlein1, aspina7, amymikhail
