Homework 2

You are currently in the GitHub repository (repo) for HW-2. Before starting, you must have completed all the steps in Setting Up.

Learning Goals

  • You will perform an Exploratory Data Analysis (EDA) using visualizations and tables. What do I mean by EDA? As explained here, it is used as a preliminary analysis tool for:
    • the detection of mistakes
    • checking of assumptions
    • preliminary selection of appropriate models
    • determining relationships among the explanatory variables, and
    • assessing the direction and rough size of relationships between explanatory and outcome variables.
  • Loosely speaking, any method of looking at data that does not include formal statistical modeling and inference falls under the term EDA. It is just as important as an actual analysis, if not more so.
  • You will work with categorical variables, especially when the categories are very messy.
  • You will answer open-ended questions with no explicit "finish line". I have set some boundaries, however, so that you can't get too lost.
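
To make the "messy categories" goal concrete, here is a minimal sketch in base R. It uses a small made-up data frame, not the homework data: the variable name `diet` and its values are assumptions chosen purely for illustration. The idea is to tabulate the raw categories first (an EDA table), then normalise and collapse them before tabulating again.

```r
# Toy data: a hypothetical messy categorical variable (NOT taken from profiles.csv)
toy <- data.frame(
  diet = c("Vegan", "vegan ", "strictly vegan", "anything",
           "Anything", NA, "mostly anything"),
  stringsAsFactors = FALSE
)

# Step 1: tabulate the raw categories, keeping missing values visible
raw_counts <- table(toy$diet, useNA = "ifany")
print(raw_counts)

# Step 2: normalise case and strip stray whitespace
clean <- trimws(tolower(toy$diet))

# Step 3: collapse modifiers ("strictly", "mostly") into the base category
clean <- sub("^(strictly|mostly) ", "", clean)

# Step 4: tabulate again to see the simplified categories
clean_counts <- table(clean, useNA = "ifany")
print(clean_counts)
```

Whether collapsing modifiers like this is appropriate depends on the question you are asking; comparing the two tables is itself part of the EDA.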

Homework

  1. Follow the same workflow as in HW-0 for HW-2.
  2. Do not submit a HW-2.Rmd file that does not knit.
  3. I anticipate you spending between 8 and 12 hours in total (across all submissions) on this homework.

Data

The data are the result of a Python script that scraped the OkCupid website. We consider 59K users who:

  • were members on 2012/06/26
  • lived within 25 miles of SF
  • had been online in the last year
  • had at least one photo

Their public profiles were pulled on 2012/06/30, i.e. only data that was visible to the public. Also included are essay question responses in variables essay0 through essay9. The codebook for the data is available here. Note:

  • Usernames were excluded from the data set. However, this does not guarantee anonymity.
  • Permission to use this data was granted by OkCupid.

Run the following code once to download the OkCupid data from GitHub into your HW-2 RStudio project i.e. into your HW-2 folder. After you run this, there should be a file profiles.csv in the same directory as all your HW-2 files.

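# Download the zipped data to a temporary file, then extract profiles.csv
url <- "https://github.com/rudeboybert/JSE_OkCupid/blob/master/profiles.csv.zip?raw=true"
temp_zip_file <- tempfile()
# mode = "wb" forces a binary download so the zip file is not corrupted on Windows
download.file(url, temp_zip_file, mode = "wb")
unzip(temp_zip_file, "profiles.csv")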
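
Once the download has run, a quick sanity check is a reasonable first EDA step. The sketch below is one possible way to do it in base R; the helper name `check_profiles` is hypothetical (it is not part of the assignment), and the only thing it assumes about the data is that `profiles.csv` sits in the working directory.

```r
# Hypothetical helper (not part of the assignment): load a CSV and report
# its dimensions, column names, and the number of missing values per column.
check_profiles <- function(path = "profiles.csv") {
  if (!file.exists(path)) {
    message("File not found: ", path, " (run the download code first)")
    return(invisible(NULL))
  }
  profiles <- read.csv(path, stringsAsFactors = FALSE)
  list(
    n_rows    = nrow(profiles),
    n_cols    = ncol(profiles),
    columns   = names(profiles),
    n_missing = colSums(is.na(profiles))
  )
}

# Assumes profiles.csv is in the current working directory
info <- check_profiles()
str(info)
```

If the file is missing, the helper prints a message and returns NULL instead of stopping, so the check will not break a document that is being knit.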

Tips
