BusinessAnalyticsCourse

This is a repository to organize the teaching material for a new course at Monash University, to be jointly taught by Di Cook, Rob Hyndman and Souhaib Ben Taieb.

Class times: Mon 1-2, E365; Tue 4-6, E157; Thu 11-12, H7

Topics

  • What is statistical learning?
  • Types of models: regression, classification, unsupervised
  • Model estimation: numerical optimization
  • Model assessment and selection: predictive accuracy
  • Resampling methods: bootstrap, permutation, cross-validation, bagging
  • Regression: splines, additive models, ridge regression, variable selection, lasso
  • High dimension, low sample size: principal components, partial least squares, regularization
  • Supervised classification: logistic regression, neural networks, trees, forests, support vector machines, ensembles
  • Unsupervised classification: hierarchical clustering, k-means, self-organizing maps, model-based clustering
  • Cleaning, exploring and visualizing data

Tentative agenda

  • Week 1. (Jul 27) Introduction to business analytics and R (Rob & Souhaib)

    • Lecture 1: What is Business Analytics?
    • Lab 1: R exercises
    • Lecture 2: Showcase of R.
  • Week 2. (Aug 3) Statistical Learning. Ch2. (Rob & Souhaib)

    • Lecture 3: More on R and statistical learning.
    • Lab 2:
    • Lecture 4: Assessing model accuracy. Bias-variance tradeoff

    Content:

    • The four V’s of big data/data science
    • Analytics and data science jobs: “By 2018, the US could face a shortage of up to 190,000 workers with analytical skills” (McKinsey)
  • Week 3. (Aug 10) Regression for prediction. Ch3 (Rob) [Souhaib away]

    • Lecture 5: Review of linear regression, matrix formulation
    • Lab 3:
    • Lecture 6: Subset selection, LOOCV
  • Week 4. (Aug 17) Resampling. Ch5 (Rob)

    • Lecture 7: Cross-validation
    • Lab 4:
    • Lecture 8: Bootstrap
  • Week 5. (Aug 24) Dimension reduction. Ch6,10. (Rob and Souhaib)

    • Lecture 9: PCA
    • Lab 5:
    • Lecture 10: PLS
  • Week 6. (Aug 31) Visualization. Own lecture notes. (Di)

    • Lecture 15: Basic data plots, categorical/numeric variables, faceting
    • Lab 8:
    • Lecture 16: Data cleaning
  • Week 7. (Sep 7) Visualization. Own lecture notes. (Di)

    • Lecture 17: Plotting geographic data
    • Lab 9:
    • Lecture 18: Plotting multivariate data
  • Week 8. (Sep 14) Classification. Ch4,8 (Souhaib & Di)

    • Lecture 11: LDA
    • Lab 6:
    • Lecture 12: Trees
  • Week 9. (Sep 21) Classification. Ch4,9. (Souhaib) [Di away Sep 23-25]

    • Lecture 13: SVM
    • Lab 7:
    • Lecture 14: k-NN
  • [Break Sep 28-Oct 4]

  • Week 10. (Oct 5) Advanced classification. Ch8. (Di) [Rob and Souhaib away]

    • Lecture 19: Bagging, random forests
    • Lab 10:
    • Lecture 20: Boosting
  • Week 11. (Oct 12) Advanced regression. Ch6. (Di & Souhaib) [Souhaib away?]

    • Lecture 21: Regularization
    • Lab 11:
    • Lecture 22: Shrinkage
  • Week 12. (Oct 19) Clustering. Ch10. (Souhaib & Di) [Souhaib away?, Rob away Oct 22-]

    • Lecture 23: k-means
    • Lab 12:
    • Lecture 24: hierarchical clustering

Textbook:

James, Witten, Hastie, Tibshirani, An Introduction to Statistical Learning with Applications in R, Springer, 2013 (pdf available at http://www-bcf.usc.edu/~gareth/ISL/)

Additional Reading:

Software:

  • R/RStudio, with the following packages and all their dependencies:

    • ggplot2, scales, stringr, plyr, dplyr, tidyr, reshape2, GGally, dichromat, magrittr, munsell, RColorBrewer, colorspace, wordcloud, vcd, gridExtra, hexbin, ggdendro, shiny, ggvis, ggsubplot
    • MASS, e1071, caret, randomForest, kohonen, cluster, fpc, mclust, rpart, nnet, nlme, vegan, penalizedLDA, PPtree, class, FNN, RSNNS, DMwR
    • mvtnorm, mvnormtest, HH, ICSNP, matrixStats, schoRsch, Matrix, psych
    • lubridate, tm, tuneR, caTools, maps, ggmap, maptools, shapefiles, sp, rworldmap
    • rmarkdown, knitr, devtools, roxygen2, profr
    • gWidgets, RGtk2, MissingDataGUI
    • foreign, jsonlite, curl, RCurl, rvest, Rcpp, XML, twitteR
    • xtable
    • classifly, clusterfly, meifly, LCA, LDAvis, nullabor
    • rggobi (requires ggobi software installed on the computer): installation should be easy on Windows machines
  • Weka ? -- perhaps just mention that software other than R exists. It is too much to get them to learn more than one language in this unit.
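The R packages listed above can be installed from the console rather than one at a time. The following is a minimal sketch, not part of the course material: the vector `course_pkgs` holds only a subset of the list for brevity, and both `course_pkgs` and the helper `missing_packages` are hypothetical names introduced here. It assumes a CRAN mirror is configured; availability of individual packages on CRAN may have changed since this list was written.

```r
# Subset of the packages listed above (assumption: extend with the rest as needed).
course_pkgs <- c("ggplot2", "dplyr", "tidyr", "MASS", "e1071", "caret",
                 "randomForest", "rpart", "rmarkdown", "knitr")

# Hypothetical helper: return the packages in `pkgs` that are not yet installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(course_pkgs)

# Only contact CRAN in an interactive session, and only if something is missing.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install, dependencies = TRUE)
}
```

Checking against `installed.packages()` first avoids reinstalling packages that are already present, which matters on lab machines with slow connections.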

Approach to teaching R?

Use R Markdown lecture notes like those at http://dicook.github.io/stat585/, which should help students get comfortable with R themselves. (Ignore the ugly web site design!)

Also draw on the material at this site: http://www.stat.iastate.edu/ccgs/short-courses/

Rmarkdown slides

In RStudio, create a new project in your git directory (BusinessAnalyticsCourse) and open slides/1-intro.Rmd.

In the RStudio interface you should see a "Knit HTML" button. Click it to compile and preview the slides.
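The slides can also be compiled from the R console instead of the button. This is a sketch under two assumptions: the rmarkdown package is installed, and the working directory is the project root so that the relative path resolves. The variable names `slides` and `out` are illustrative only.

```r
# Path from the instructions above, relative to the project root.
slides <- file.path("slides", "1-intro.Rmd")

if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists(slides)) {
  # render() knits the document and returns the path of the output file.
  out <- rmarkdown::render(slides)
  message("Slides written to: ", out)
} else {
  message("Install rmarkdown and run this from the project root.")
}
```

The output format is taken from the YAML header of the .Rmd file, so the same call works whichever slide format the file declares.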
