getandclean's Introduction

title: README.md
author: TSBrownPhD
date: May 12, 2016
output: html_document

Getting and Cleaning Data - Week 4 Assignment

Generating a Tidy Data Set

This assignment uses the data files made publicly available from the study "Human Activity Recognition Using Smartphones Dataset - Version 1.0" by Jorge L. Reyes-Ortiz, Davide Anguita, Alessandro Ghio, Luca Oneto and Xavier Parra, provided as part of the course via https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip

The objectives of this project were to create one R script called run_analysis.R that does the following:

  1. Merges the training and the test sets to create one data set.
  2. Extracts only the measurements on the mean and standard deviation for each measurement.
  3. Uses descriptive activity names to name the activities in the data set.
  4. Appropriately labels the data set with descriptive variable names.
  5. From the data set in step 4, creates a second, independent tidy data set with the average of each variable for each activity and each subject.

To achieve this, my run_analysis.R takes the following steps:

1. Merges the training and the test sets to create one data set.

Read in data files.

a. Set the working directory

b. Create objects to hold the file locations, which makes the code that reads the files easier to follow

c. Read the data into dataframes
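Steps a through c can be sketched roughly as follows. This is a minimal illustration, not the code in run_analysis.R: it assumes the archive has been unzipped into a folder named "UCI HAR Dataset", the helper name read_har and the object names are hypothetical, and it takes the data directory as an argument rather than calling setwd().

```r
# Hypothetical helper: reads the six core files of the unzipped archive.
read_har <- function(data_dir = "UCI HAR Dataset") {
  # b. Hold the file locations in one named vector so the reads stay short
  paths <- c(
    x_train       = file.path(data_dir, "train", "X_train.txt"),
    y_train       = file.path(data_dir, "train", "y_train.txt"),
    subject_train = file.path(data_dir, "train", "subject_train.txt"),
    x_test        = file.path(data_dir, "test", "X_test.txt"),
    y_test        = file.path(data_dir, "test", "y_test.txt"),
    subject_test  = file.path(data_dir, "test", "subject_test.txt")
  )
  # Fail early with a clear error if the archive is not where we expect it
  stopifnot(all(file.exists(paths)))

  # c. Read each whitespace-delimited file into a data frame;
  #    the result is a list named after the entries of `paths`
  lapply(paths, read.table, header = FALSE)
}
```

Calling `har <- read_har()` would then give access to `har$x_train`, `har$y_test`, and so on.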

Merge the training and the test sets to create one data set.

d. Combine the complete test and train sets

e. Add names to columns in combined dataset from features.txt
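Steps d and e might look like the sketch below. The small data frames are toy stand-ins for the files in the archive (the real measurement files have 561 columns), and stacking train before test is an assumption; the source does not state the order.

```r
# Toy stand-ins for the data frames read from X_train.txt and X_test.txt
x_train <- data.frame(V1 = c(0.1, 0.2), V2 = c(0.3, 0.4))
x_test  <- data.frame(V1 = 0.5, V2 = 0.6)

# features.txt has two columns: a running index and the feature name
features <- data.frame(V1 = 1:2,
                       V2 = c("tBodyAcc-mean()-X", "tBodyAcc-std()-X"),
                       stringsAsFactors = FALSE)

# d. Stack the training rows on top of the test rows
combined <- rbind(x_train, x_test)

# e. Use the second column of features.txt as the column names
names(combined) <- features$V2
```

Assigning through `names<-` keeps the original feature names, including the parentheses and hyphens, which matters for the pattern matching in step 2.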

2. Extracts only the measurements on the mean and standard deviation for each measurement.

a. Find all columns with "mean()" in the name

b. Find all columns with "std()" in the name

c. Column-bind the matching columns into a new object named "meanstd"
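One way to express this selection is sketched below on a toy data frame. Whether run_analysis.R uses literal or regular-expression matching is not stated here, so `fixed = TRUE` is an assumption. The distinction matters: as a regular expression, "mean()" contains an empty group and also matches names such as "fBodyAcc-meanFreq()-X", which would be consistent with the 79 measurement columns (4 through 82) used in step 5.

```r
# Toy combined data set with feature-style column names
combined <- data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8)
names(combined) <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X",
                     "fBodyAcc-meanFreq()-X", "tBodyAcc-max()-X")

# a./b. fixed = TRUE matches the literal text, parentheses included,
#       so "meanFreq()" is not picked up here
mean_cols <- grep("mean()", names(combined), fixed = TRUE)
std_cols  <- grep("std()",  names(combined), fixed = TRUE)

# c. Column-bind the two subsets; drop = FALSE keeps them as data frames
#    even when only one column matches
meanstd <- cbind(combined[, mean_cols, drop = FALSE],
                 combined[, std_cols,  drop = FALSE])
```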

3. Uses descriptive activity names to name the activities in the data set.

a. Combine the subject numbers and activity labels

b. Name the subject label columns

c. Add a human-readable activity name derived from the activity label

d. Add the combined subject numbers, activity labels, and human-readable activity names to the full combined dataset, producing a complete, named dataset, and also to the mean/std subset from step 2.
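The steps above can be sketched as follows, using toy stand-ins for the subject, label, and measurement data. The lookup with match() is one possible approach and an assumption on my part; the six activity names are those listed in activity_labels.txt.

```r
# Toy stand-ins: one row per observation
subject <- data.frame(V1 = c(1, 1, 2))       # stacked subject files
label   <- data.frame(V1 = c(5, 1, 5))       # stacked y files
meanstd <- data.frame(m = c(0.1, 0.2, 0.3))  # subset from step 2

# activity_labels.txt maps the codes 1 to 6 to activity names
activity_labels <- data.frame(
  code     = 1:6,
  activity = c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS",
               "SITTING", "STANDING", "LAYING"),
  stringsAsFactors = FALSE
)

# a./b. Combine subject numbers and activity labels, then name the columns
desc <- cbind(subject, label)
names(desc) <- c("subject", "label")

# c. Look up the readable name for each code; match() keeps the row order
desc$activity <- activity_labels$activity[match(desc$label, activity_labels$code)]

# d. Prepend the three descriptive columns to the mean/std subset
meanstddesc <- cbind(desc, meanstd)
```

A lookup by position is used here instead of merge() because merge() sorts the result by the key column by default, which would break the row alignment with the measurement data.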

4. Appropriately labels the data set with descriptive variable names.

The feature columns were already labeled in step 1; at this point I also renamed "subject" and "label" to "study_subject" and "activity_code", respectively, for clarity.
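A minimal sketch of the renaming, on a toy version of the data frame. Renaming by name rather than by position is an assumption; it has the advantage of not depending on the column order.

```r
# Toy version of the descriptive columns
meanstddesc <- data.frame(subject  = c(1, 2),
                          label    = c(5, 1),
                          activity = c("STANDING", "WALKING"),
                          stringsAsFactors = FALSE)

# Rename by matching the old name, so column order does not matter
names(meanstddesc)[names(meanstddesc) == "subject"] <- "study_subject"
names(meanstddesc)[names(meanstddesc) == "label"]   <- "activity_code"
```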

5. From the data set in step 4, creates a second, independent tidy data set with the average of each variable for each activity and each subject.

a. Use the aggregate function to group columns 4 through 82 of the data frame meanstddesc by study_subject and activity, and calculate the mean of each column within each group

b. Finally, use dput to put the contents of the data frame tidy into a file named "tidy.txt"
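The two steps can be illustrated on a toy data frame as below. The measurement columns here are 4:5 rather than 4:82, and the output path uses tempdir() only so the sketch does not write into the working directory; both are assumptions made for the example.

```r
# Toy version of meanstddesc with two measurement columns
meanstddesc <- data.frame(
  study_subject = c(1, 1, 2, 2),
  activity_code = c(1, 1, 1, 1),
  activity      = c("WALKING", "WALKING", "WALKING", "WALKING"),
  m1            = c(1, 3, 10, 20),
  m2            = c(2, 4, 30, 50),
  stringsAsFactors = FALSE
)

# a. Average each measurement column per subject and activity
#    (columns 4:82 in the real data, 4:5 in this toy example)
tidy <- aggregate(meanstddesc[, 4:5],
                  by = list(study_subject = meanstddesc$study_subject,
                            activity      = meanstddesc$activity),
                  FUN = mean)

# b. dput writes an R-readable text representation of the object
out_file <- file.path(tempdir(), "tidy.txt")
dput(tidy, file = out_file)

# The file is read back with dget, not read.table
restored <- dget(out_file)
```

Because dput stores the object as R code, the resulting file is meant to be loaded with dget(); write.table() would be the alternative if a plain delimited table is wanted.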
