
nhanes-r-programming's Introduction

National Health and Nutrition Examination Survey (NHANES) R-programming


Here is a link to the NHANES website so you can first learn about this data repository: NHANES Homepage

An understanding of their metadata repository is highly encouraged.

Who can benefit from this code repository?

This repository is meant to help anybody who would like to conduct research using the National Health and Nutrition Examination Survey (NHANES) data repository in R. In this project we go from downloading our data directly from the NHANES data repository, using an R package called RNHANES, to cleaning our data, imputing missing values using an R package called mi, providing descriptive statistics with graphics, and using logistic regression to try to forecast our outcome.

SQL R nhanes 1999-2018 walking impairment

This file downloads raw data files from 1999-2018 and merges them into a "master" file that is the starting point for any research project (after a proper literature review!). NHANES breaks its data down by years, and within each year it further divides the data into datasets of questions that are similar to one another. This structure is useful once you can navigate the site and have a thorough understanding of the metadata repository. We join our different datasets using basic SQL commands and output our data file as a CSV file.
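The join step described above can be sketched in a few lines. This is a minimal illustration, not the repository's actual code: it uses base R's merge() in place of the SQL commands the file uses, and the two small data frames are made-up stand-ins for NHANES component files. SEQN is the respondent sequence number that NHANES uses to link files within a survey cycle; the values shown are invented.

```r
# Toy stand-ins for two NHANES component files from one survey cycle.
# The values are invented for illustration; they are not NHANES records.
demo <- data.frame(
  SEQN     = c(1, 2, 3, 4),      # respondent sequence number (join key)
  RIAGENDR = c(1, 2, 2, 1),      # gender code
  RIDAGEYR = c(45, 62, 38, 71)   # age in years
)
diab <- data.frame(
  SEQN   = c(1, 2, 4),
  DIQ010 = c(2, 1, 1)            # diabetes questionnaire item
)

# Left join on SEQN: keep every demographics row and attach the
# questionnaire answer where one exists (NA otherwise).
# In SQL terms: SELECT * FROM demo LEFT JOIN diab USING (SEQN)
master <- merge(demo, diab, by = "SEQN", all.x = TRUE)

# Write the merged "master" file as CSV (to a temporary folder here).
out_path <- file.path(tempdir(), "master.csv")
write.csv(master, out_path, row.names = FALSE)
```

A left join keeps respondents who skipped a questionnaire, which matters later when missing values are imputed rather than dropped.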

Note: The RNHANES package loses support after the year 2014, so we have to manually download SAS files and work with these files within the R environment.
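For the manually downloaded files, one possible approach is sketched below. It assumes the files are SAS transport (.XPT) files and reads them with read.xport() from the foreign package, which is an assumption; the repository may use a different reader (haven::read_xpt() is a common alternative). The helper name read_nhanes_xpt is hypothetical, and the file URL must be copied from the NHANES website because none is hard-coded here.

```r
# Hypothetical helper: download one SAS transport (.XPT) file and read it
# into a data frame. `url` is supplied by the caller.
read_nhanes_xpt <- function(url, dest = tempfile(fileext = ".XPT")) {
  # mode = "wb" keeps the binary file intact on Windows.
  download.file(url, destfile = dest, mode = "wb", quiet = TRUE)
  foreign::read.xport(dest)
}

# Usage (not run; replace the placeholder with a real file URL):
# demo_2015 <- read_nhanes_xpt("<URL of the .XPT file>")
```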

Descriptive statistics / Inferential statistics / Graphics using ggplot2

In this file we conduct statistical inference in the form of a chi-square test of equal proportions and a chi-square test of independence, and we create a correlation matrix of all variables in our dataset. We also summarize our dataset with descriptive statistics, visualizing it as waffle plots (square pie charts) and box plots using ggplot2.
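The tests named above can be sketched with base R alone. The data here are simulated, not NHANES records, and the variable names (diabetes, walk_imp, age) are placeholders. The "equal proportions" test is read here as a goodness-of-fit test against equal category frequencies, which is one interpretation of the description.

```r
set.seed(1)
n <- 200

# Simulated 0/1 indicators and an age column (illustration only).
diabetes <- rbinom(n, 1, 0.3)
walk_imp <- rbinom(n, 1, ifelse(diabetes == 1, 0.4, 0.15))
age      <- round(runif(n, 40, 80))

# Chi-square test of equal proportions: with a one-way table,
# chisq.test() tests against equal expected frequencies by default.
gof <- chisq.test(table(walk_imp))

# Chi-square test of independence on the 2x2 table
# (Yates' continuity correction is applied by default for 2x2 tables).
ind <- chisq.test(table(diabetes, walk_imp))

# Pearson correlation matrix of all variables.
cor_mat <- cor(data.frame(age, diabetes, walk_imp))

print(gof)
print(ind)
print(round(cor_mat, 2))
```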

Here are some examples of the data visualizations that we will create using this code:

figure1h_walking impairment

waffle plot

figure2_correlation_matrix

figure1a_age

Logistic regression

In this file we conduct binary logistic regression to try to forecast the odds of a person experiencing walking impairment in their lifetime. Our logistic regression model is a function of demographic and health predictor (independent) variables that represent our population of interest. These variables, such as diabetes and gender, are associated with walking impairment. They are important because having one or more of these characteristics increases the odds of having walking impairment in your lifetime. We arrive at a statistically significant model and provide some model performance metrics, such as a ROC curve (as seen below), to gauge the predictive ability of our model.
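A minimal sketch of this kind of model, using base R's glm() on simulated data, is given here. The predictors, coefficients, and sample size are invented for illustration and do not reproduce the repository's results. The area under the ROC curve is computed with the rank (Mann-Whitney) formula so that no plotting package is needed.

```r
set.seed(42)
n <- 500

# Simulated predictors and outcome (illustration only, not NHANES data).
diabetes <- rbinom(n, 1, 0.3)
female   <- rbinom(n, 1, 0.5)
log_odds <- -1.5 + 1.0 * diabetes + 0.4 * female
walk_imp <- rbinom(n, 1, plogis(log_odds))
dat <- data.frame(walk_imp, diabetes, female)

# Binary logistic regression.
fit <- glm(walk_imp ~ diabetes + female, data = dat, family = binomial)

# Exponentiated coefficients are odds ratios.
odds_ratios <- exp(coef(fit))

# Predicted probabilities for each person in the data.
p_hat <- predict(fit, type = "response")

# Area under the ROC curve via the rank formula (ties get average ranks).
r   <- rank(p_hat)
n1  <- sum(dat$walk_imp == 1)
n0  <- sum(dat$walk_imp == 0)
auc <- (sum(r[dat$walk_imp == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

print(summary(fit))
print(round(odds_ratios, 2))
print(auc)
```

An odds ratio above 1 for a predictor means the fitted odds of the outcome are higher when that characteristic is present, holding the other predictors fixed.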

figure3_roc_curve

Machine learning models / Generalized Linear Model

In these files we put together some machine learning models and one GLM to try to predict walking impairment in an individual. These models include:

1) Decision Trees

2) Neural Networks

3) Random Forest (partykit R package)

4) Random Forest

5) Generalized linear model: Ordinal Logistic Regression
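As an illustration of the first model in the list, a classification tree can be fit as sketched below. The sketch assumes the rpart package, which is an assumption since the source does not say which package the decision tree files use, and it runs on simulated data with placeholder variable names.

```r
library(rpart)

set.seed(7)
n <- 400

# Simulated data (illustration only, not NHANES data).
dat <- data.frame(
  age      = round(runif(n, 40, 80)),
  diabetes = rbinom(n, 1, 0.3)
)
p <- plogis(-4 + 0.05 * dat$age + 1.0 * dat$diabetes)
dat$walk_imp <- factor(rbinom(n, 1, p), levels = c(0, 1),
                       labels = c("no", "yes"))

# 70/30 split into training and holdout rows.
train_idx <- sample(n, size = 0.7 * n)
train   <- dat[train_idx, ]
holdout <- dat[-train_idx, ]

# Fit a classification tree and predict the class for holdout rows.
tree <- rpart(walk_imp ~ age + diabetes, data = train, method = "class")
pred <- predict(tree, newdata = holdout, type = "class")

# Share of holdout rows classified correctly.
accuracy <- mean(pred == holdout$walk_imp)
print(accuracy)
```

Evaluating on held-out rows rather than the training rows gives a less optimistic estimate of how the tree would perform on new respondents.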

Contributors

john-m-burleson
