EC 524, Winter 2020

Welcome to Economics 524 (424): Prediction and machine-learning in econometrics, taught by Ed Rubin and Connor Lennon.

Schedule

Lecture Tuesday and Thursday, 10:00am–11:50am, 105 Peterson Hall

Lab Friday, 12:00pm–12:50pm, 102 Peterson Hall

Office hours

  • Ed Rubin (PLC 519): Thursday (2pm–3pm); Friday (1pm–2pm)
  • Connor Lennon (PLC 430): Monday (1pm–2pm)

Syllabus

Syllabus

Books

Required books

Suggested books

Lecture notes

000 - Overview (Why predict?)

  1. Why do we have a class on prediction?
  2. How is prediction (and how are its tools) different from causal inference?
  3. Motivating examples

Formats .html | .pdf | .Rmd

001 - Statistical learning foundations

  1. Why do we have a class on prediction?
  2. How is prediction (and how are its tools) different from causal inference?
  3. Motivating examples

Formats .html | .pdf | .Rmd

002 - Model accuracy

  1. Model accuracy
  2. Loss for regression and classification
  3. The bias-variance tradeoff
  4. The Bayes classifier
  5. KNN

Formats .html | .pdf | .Rmd

003 - Resampling methods

  1. Review
  2. The validation-set approach
  3. Leave-one-out cross validation
  4. k-fold cross validation
  5. The bootstrap
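
As a rough illustration of the resampling ideas listed above, here is a minimal sketch of k-fold cross validation. The course materials work in R; this sketch uses plain Python only for illustration. The function names `k_fold_indices` and `cv_mse` are hypothetical, and the "model" is a stand-in that simply predicts the training mean.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_mse(y, k=5, seed=0):
    """k-fold CV estimate of test MSE for a model that predicts the training mean."""
    folds = k_fold_indices(len(y), k, seed)
    fold_mse = []
    for held_out in folds:
        held = set(held_out)
        # "Fit" on the other k-1 folds (assumed stand-in model: the mean of y)
        train = [y[i] for i in range(len(y)) if i not in held]
        pred = mean(train)
        # Evaluate on the held-out fold only
        fold_mse.append(mean((y[i] - pred) ** 2 for i in held_out))
    return mean(fold_mse)

y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
print(cv_mse(y, k=3))
print(cv_mse(y, k=len(y)))  # setting k = n gives leave-one-out CV
```

Setting `k` equal to the number of observations recovers leave-one-out cross validation, which is why the last call does not depend on the shuffle.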

In-class: Validation-set exercise (Kaggle)

Formats .html | .pdf | .Rmd

004 - Linear regression strikes back

  1. Returning to linear regression
  2. Model performance and overfit
  3. Model selection—best subset and stepwise
  4. Selection criteria

Formats .html | .pdf | .Rmd

005 - Shrinkage methods

  1. Ridge regression
  2. Lasso
  3. Elastic net

Formats .html | .pdf | .Rmd

006 - Classification intro

  1. Introduction to classification
  2. Why not regression?
  3. But also: Logistic regression
  4. Assessment: Confusion matrix, assessment criteria, ROC, and AUC
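
The assessment criteria in item 4 can be sketched from a confusion matrix. This is an illustrative Python sketch, not the course's R code. The helper names `confusion_counts` and `assessment` are hypothetical, labels are assumed to be coded 0/1, and the example assumes both classes appear and at least one positive is predicted (otherwise a denominator is zero).

```python
def confusion_counts(actual, predicted):
    """Count the four cells of a binary confusion matrix (labels coded 0/1)."""
    pairs = list(zip(actual, predicted))
    return {
        "tp": sum(a == 1 and p == 1 for a, p in pairs),
        "fp": sum(a == 0 and p == 1 for a, p in pairs),
        "tn": sum(a == 0 and p == 0 for a, p in pairs),
        "fn": sum(a == 1 and p == 0 for a, p in pairs),
    }

def assessment(c):
    """Common criteria computed from confusion-matrix counts."""
    total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total,
        "sensitivity": c["tp"] / (c["tp"] + c["fn"]),  # true-positive rate
        "specificity": c["tn"] / (c["tn"] + c["fp"]),  # true-negative rate
        "precision": c["tp"] / (c["tp"] + c["fp"]),
    }

actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
counts = confusion_counts(actual, predicted)
print(counts)
print(assessment(counts))
```

An ROC curve repeats this calculation across classification thresholds, plotting sensitivity against one minus specificity.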

Formats .html | .pdf | .Rmd

007 - Decision trees

  1. Introduction to trees
  2. Regression trees
  3. Classification trees—including the Gini index, entropy, and error rate
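
The three node-impurity measures named in item 3 can be sketched as follows. This is an illustrative Python sketch rather than the course's R code. The function names are hypothetical, and the base-2 logarithm in `entropy` is an assumption (some texts use the natural log).

```python
from collections import Counter
from math import log2

def class_shares(labels):
    """Share of observations in each class within a node."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]

def gini(labels):
    """Gini index: sum over classes of p * (1 - p)."""
    return sum(p * (1 - p) for p in class_shares(labels))

def entropy(labels):
    """Entropy: minus the sum over classes of p * log2(p) (base 2 is an assumption)."""
    return -sum(p * log2(p) for p in class_shares(labels))

def error_rate(labels):
    """Classification error rate: one minus the share of the modal class."""
    return 1 - max(class_shares(labels))

mixed = ["yes", "yes", "no", "no"]
pure = ["yes", "yes", "yes", "yes"]
print(gini(mixed), entropy(mixed), error_rate(mixed))
print(gini(pure), entropy(pure), error_rate(pure))
```

All three measures are zero for a pure node and largest when the classes are evenly mixed.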

Formats .html | .pdf | .Rmd

008 - Ensemble methods

  1. Introduction
  2. Bagging
  3. Random forests
  4. Boosting

Formats .html | .pdf | .Rmd

009 - Support vector machines

  1. Hyperplanes and classification
  2. The maximal margin hyperplane/classifier
  3. The support vector classifier
  4. Support vector machines

Formats .html | .pdf | .Rmd

Projects

Intro Predicting sales price in housing data (Kaggle)

Help: Kaggle notebooks

001 KNN and loss (Kaggle notebook)
You will need to sign in to your Kaggle account and then hit "Copy and Edit" to add the notebook to your account.
Due 21 January 2020 before midnight.

002 Cross validation and linear regression (Kaggle notebook)
Due 04 February 2020 before midnight.

003 Model selection and shrinkage (Kaggle notebook)
Due 13 February 2020 before midnight.

004 Predicting heart disease (Kaggle competition) | Competition
Due 20 February 2020 before midnight.

005 Classifying customer churn (Kaggle competition) | Competition
Due in class 27 February 2020.

Class project
Due 12 March 2020 before class.

Lab notes

000 - Workflow and cleaning

  1. General "best practices" for coding
  2. Working with RStudio
  3. The pipe (%>%)

Formats .html | .pdf | .Rmd

001 - dplyr and Kaggle notebooks

  1. Finish previous lab on dplyr
  2. Working in (Kaggle) notebooks
  3. Kaggle contest notes

002 - Cross validation and simulation

  1. Cross-validation review
  2. CV and interdependence
  3. Writing functions
  4. Introduction to learning via simulation
  5. Simulation: CV and dependence

Formats .html | .pdf | .Rmd

Additional R script for simulation

003 - Data cleaning and dplyr

004 - Data cleaning and workflow with tidymodels

005 - Perceptrons and neural nets

Additional Data cleaning in R (with caret)

  • Converting numeric variables to categorical
  • Converting categorical variables to dummies
  • Imputing missing values
  • Standardizing variables (centering and scaling)
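
The four cleaning steps above can be sketched in a few lines. The linked material uses R with caret; this is only an illustrative Python sketch using the standard library. All function names, the `cutoff` argument, and the `"low"`/`"high"` labels are hypothetical, and mean imputation is just one assumed choice of imputation rule.

```python
from statistics import mean, stdev

def to_category(values, cutoff):
    """Convert a numeric variable to a two-level categorical variable."""
    return ["high" if v >= cutoff else "low" for v in values]

def to_dummies(categories):
    """Convert a categorical variable to one 0/1 dummy per level."""
    levels = sorted(set(categories))
    return [{f"is_{lvl}": int(c == lvl) for lvl in levels} for c in categories]

def impute_mean(values):
    """Replace missing values (None) with the mean of the observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def standardize(values):
    """Center on the mean and scale by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

income = [1.0, None, 3.0]
filled = impute_mean(income)
print(filled)
print(standardize(filled))
print(to_dummies(to_category(filled, cutoff=2.0)))
```

In a prediction workflow, the imputation mean and the centering and scaling values would be computed on the training data and then applied to the test data.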

Additional resources

R

Data Science

Spatial data
