UC Davis Statistics 208 : Statistical Machine Learning

A Course on the Principles of Statistical Machine Learning with Examples in Python

Machine learning is the study of how to get computers to learn and improve automatically with experience. Experience comes in the form of data, improvement is with respect to some performance metric, and learning is done by a learning algorithm. There are always computational constraints, such as the computing architecture, computation time, and bandwidth limitations. So we can state the goal more precisely: to construct learning algorithms that use data to improve with respect to a performance metric, subject to computational constraints.
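
As a toy illustration of that loop (a sketch of mine, not part of the course materials, assuming only NumPy from the course's Python setup): the data are noisy (x, y) pairs, the learning algorithm is least squares, and the performance metric is squared error on held-out points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data: noisy samples from an unknown linear relationship
x = rng.uniform(-1, 1, size=200)
y = 2.0 * x + 0.5 + 0.3 * rng.standard_normal(200)

# Split into training data (for learning) and test data (for the metric)
x_train, y_train = x[:150], y[:150]
x_test, y_test = x[150:], y[150:]

# Learning algorithm: ordinary least squares on the training data
X_train = np.column_stack([x_train, np.ones_like(x_train)])
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Performance metric: mean squared error on the held-out data
X_test = np.column_stack([x_test, np.ones_like(x_test)])
test_mse = np.mean((X_test @ coef - y_test) ** 2)
print(f"held-out MSE: {test_mse:.3f}")
```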

We will focus on the principles of statistical machine learning for the prediction problems of regression and classification. Conspicuously absent is any sustained treatment of Bayesian methodology, hidden Markov models, or unsupervised learning (density estimation, clustering, dimension reduction, network modelling). This course is not a broad overview of all of machine learning, but rather a tour of its key ideas as told through these prediction tasks. Students typically tell me something along the lines of "I thought machine learning was about [insert random methodology here]". Machine learning is a field, like physical chemistry or creative literature. It is not defined by a couple of methods or a single task, and it cannot be taught in a single quarter. With that said, I want this course to lay the foundation for a rich understanding of machine learning.

Instructions

  1. Class time will be for lectures and labs. These will be posted in advance in their sections. Labs are not graded, and you will have an opportunity to go over your results in class.
  2. Homework will be due every week or so, starting with the second week. (50% of grade)
  3. Discussions and questions should be posted on the slack site.
  4. A final project will be due at the end of the class. Instructions can be found here. (50% of grade)
  5. You will need to install Python and the necessary packages to participate in this class. See the following instructions for the installation.

Office Hours

  • James: Wednesdays 9am-12pm in MSB 4107
  • Chunzhe: Monday 1-3pm in MSB 1117
  • Huang: Friday 1-3pm in MSB 1117

Textbooks

Homeworks

Syllabus

Introduction to Machine Learning

Principles: Over/under-fitting, training and testing, losses
Reading: ESL Chapter 2

  • 4/4 Lecture 1: Introduction to machine learning and Python I
  • 4/6 Lecture 2: Introduction to machine learning and Python II, Lab 1
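
A minimal sketch of the over/under-fitting and train/test principles from this unit (my illustration, not a course lab; it assumes only NumPy): polynomials of increasing degree drive training error down, while test error eventually turns back up.

```python
import numpy as np

rng = np.random.default_rng(1)

# True curve is smooth; observations are noisy
x = np.sort(rng.uniform(-1, 1, size=60))
y = np.sin(3 * x) + 0.2 * rng.standard_normal(60)
x_train, y_train = x[::2], y[::2]   # even-indexed points for training
x_test, y_test = x[1::2], y[1::2]   # odd-indexed points for testing

for degree in (1, 3, 9, 15):
    # Fit a degree-d polynomial by least squares on the training set
    coefs = np.polyfit(x_train, y_train, deg=degree)
    train_mse = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```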

Regression (beyond Ordinary Least Squares)

Principles: Convex relaxation, computational intractability in subset selection
Reading: ESL Chapter 3, Boyd Chapter 1

  • 4/11 Lecture 3: Subset selection and ridge regression
  • 4/13 Lecture 4: Convex optimization, Lab 2, Solutions
  • 4/18 Lecture 5: The Lasso, Lab 3, Solutions
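
To make the convex-relaxation theme concrete, here is a small sketch (mine, not the lecture code; NumPy only) of the ridge estimator, which solves the linear system (X^T X + lambda I) beta = X^T y instead of searching over all subsets of features.

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge regression: solve (X^T X + lam * I) beta = X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# Tiny demo with a nearly collinear pair of features
rng = np.random.default_rng(2)
X = rng.standard_normal((100, 5))
X[:, 1] = X[:, 0] + 0.01 * rng.standard_normal(100)
beta_true = np.array([1.0, 0.0, -2.0, 0.0, 0.5])
y = X @ beta_true + 0.1 * rng.standard_normal(100)

# Larger lambda shrinks the coefficients and stabilizes the collinear pair
for lam in (0.0, 1.0, 10.0):
    print(lam, np.round(ridge(X, y, lam), 2))
```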

Classification

Principles: Surrogate losses, generative and discriminative methods
Reading: ESL Chapter 4

  • 4/20 Lecture 6: Logistic regression
  • 4/25 Lecture 7: Classification and Generative Methods, Lab 4
  • 4/27 Lecture 8: Max-margin Methods
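
A rough sketch of logistic regression as minimization of a surrogate loss (my illustration, not course code; NumPy only): gradient descent on the average logistic loss, with labels in {0, 1}.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iters=500):
    """Minimize the logistic (surrogate) loss by gradient descent; y in {0, 1}."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))     # predicted probabilities
        grad = X.T @ (p - y) / len(y)           # gradient of the average logistic loss
        beta -= lr * grad
    return beta

# Tiny demo: two Gaussian classes
rng = np.random.default_rng(3)
X0 = rng.standard_normal((50, 2)) + np.array([-1.5, 0.0])
X1 = rng.standard_normal((50, 2)) + np.array([+1.5, 0.0])
X = np.vstack([X0, X1])
X = np.column_stack([X, np.ones(len(X))])       # add an intercept column
y = np.concatenate([np.zeros(50), np.ones(50)])

beta = fit_logistic(X, y)
accuracy = np.mean((X @ beta > 0) == (y == 1))
print(f"training accuracy: {accuracy:.2f}")
```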

Unsupervised Learning

Principles: Compression, dimension reduction
Reading: ESL Chapter 14

  • 5/2 Lecture 9: Clustering (notes)
  • 5/4 Lecture 10: Dimension Reduction (notes)
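
As a small illustration of the dimension-reduction principle (a sketch I am adding, not the lecture notebook; NumPy only): PCA computed from the SVD of the centered data matrix.

```python
import numpy as np

def pca(X, k):
    """Project X onto its top-k principal components (a plain SVD sketch)."""
    Xc = X - X.mean(axis=0)                  # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                      # top-k right singular vectors
    explained = s[:k] ** 2 / np.sum(s ** 2)  # fraction of variance captured
    return Xc @ components.T, components, explained

# Demo: 5-dimensional data that really lives near a 2-dimensional plane
rng = np.random.default_rng(4)
latent = rng.standard_normal((200, 2))
mixing = rng.standard_normal((2, 5))
X = latent @ mixing + 0.05 * rng.standard_normal((200, 5))

scores, components, explained = pca(X, k=2)
print("variance explained by 2 components:", np.round(explained.sum(), 3))
```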

Basis Expansion and Kernels

Principles: Feature extraction, the kernel trick, analysis/synthesis duality
Reading: ESL Chapter 5

  • 5/9 Lecture 11: Basis expansion and high-dimensional embeddings
  • 5/11 Lecture 12: Kernels
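
A compact sketch of the kernel trick in action (mine, not the course notebook; NumPy only): kernel ridge regression with a Gaussian kernel fits a nonlinear function while only ever touching the data through the kernel matrix.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def kernel_ridge_fit(X, y, lam=0.1, gamma=1.0):
    """Kernel ridge regression: alpha = (K + lam * I)^{-1} y."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def kernel_ridge_predict(X_train, alpha, X_new, gamma=1.0):
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Demo: a nonlinear target that a linear model in x cannot capture
rng = np.random.default_rng(5)
X = np.sort(rng.uniform(-3, 3, size=(80, 1)), axis=0)
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(80)

alpha = kernel_ridge_fit(X, y, lam=0.1, gamma=2.0)
preds = kernel_ridge_predict(X, alpha, X, gamma=2.0)
print("training MSE:", np.round(np.mean((preds - y) ** 2), 4))
```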

Resampling, Trees, and Aggregation

Principles: Interpretable models, statistical complexity, learning from experts
Reading: ESL Chapters 7, 8

  • 5/16 Lecture 13: Bootstrap and cross validation
  • 5/18 Lecture 14: Trees, generalized additive models
  • 5/23 Lecture 15: Boosting and Random Forests
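
To ground the resampling theme, a small sketch of k-fold cross-validation (my illustration, not the lab code; NumPy only), written so that any fit/predict pair can be plugged in.

```python
import numpy as np

def kfold_mse(X, y, fit, predict, k=5, seed=0):
    """Estimate test MSE of a fit/predict pair by k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])
        errors.append(np.mean((predict(model, X[test]) - y[test]) ** 2))
    return np.mean(errors)

# Demo with ordinary least squares as the fit/predict pair
rng = np.random.default_rng(6)
X = rng.standard_normal((120, 3))
y = X @ np.array([1.0, -1.0, 0.5]) + 0.2 * rng.standard_normal(120)

ols_fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
ols_predict = lambda beta, X: X @ beta
print("5-fold CV estimate of MSE:", round(kfold_mse(X, y, ols_fit, ols_predict), 3))
```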

Online Learning, Neural Networks, and Deep Learning

Principles: Multi-layer architectures, non-convex optimization, online algorithms

  • 5/25 Lecture 16: Stochastic gradient and online learning
  • 5/30 Lecture 17: Neural Networks and Backpropagation
  • 6/1 Lecture 18: Theano with notebooks
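
A minimal sketch of stochastic gradient descent for least squares (mine, not the lecture notebook; NumPy only): the online flavor of this unit, updating on one example at a time.

```python
import numpy as np

def sgd_least_squares(X, y, lr=0.01, n_epochs=20, seed=0):
    """Stochastic gradient descent on the squared loss, one example at a time."""
    rng = np.random.default_rng(seed)
    beta = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        for i in rng.permutation(len(y)):          # process examples in random order
            grad_i = (X[i] @ beta - y[i]) * X[i]   # gradient of one example's loss
            beta -= lr * grad_i
    return beta

# Demo: compare the SGD solution to the exact least-squares solution
rng = np.random.default_rng(7)
X = rng.standard_normal((500, 4))
beta_true = np.array([2.0, -1.0, 0.0, 0.5])
y = X @ beta_true + 0.1 * rng.standard_normal(500)

beta_sgd = sgd_least_squares(X, y)
beta_exact = np.linalg.lstsq(X, y, rcond=None)[0]
print("SGD:  ", np.round(beta_sgd, 2))
print("exact:", np.round(beta_exact, 2))
```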

Graphical Models

Principles: Markov network, dynamical models

  • 6/6 Lecture 19: Understanding dependence with graphical models
  • 6/8 Lecture 20: Hidden Markov models
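
As a pointer toward the HMM lecture, a short sketch of the forward algorithm (my illustration, not course code; NumPy only), which computes the likelihood of an observation sequence by propagating state probabilities one step at a time.

```python
import numpy as np

def forward(pi, A, B, obs):
    """Forward algorithm: likelihood of an observation sequence under an HMM.

    pi : (S,) initial state distribution
    A  : (S, S) transition matrix, A[i, j] = P(next=j | current=i)
    B  : (S, O) emission matrix,   B[i, k] = P(obs=k | state=i)
    obs: sequence of observation indices
    """
    alpha = pi * B[:, obs[0]]             # joint prob. of first obs and each state
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]     # propagate one step, then emit
    return alpha.sum()

# Toy 2-state, 2-symbol HMM
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])
print("P(obs = [0, 1, 0]) =", round(forward(pi, A, B, [0, 1, 0]), 4))
```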

Repository Organization

├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── labs               <- notebook files for course labs (6 labs)
│
├── homeworks          <- notebook files for homeworks (6 hws)
│
├── lectures           <- notebook files for lectures (naming convention 
│
├── references         <- Reference material, pdfs, etc.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └── make_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├── predict_model.py
│   │   └── train_model.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
└── tox.ini            <- tox file with settings for running tox; see tox.testrun.org

Project based on the cookiecutter data science project template. #cookiecutterdatascience
