Code Monkey home page Code Monkey logo

learning-from-data-a-short-course's Introduction

This repositary holds my solutions to the exercises and problems in book "Learning from Data: A Short Course" by Yaser Abu-Mostafa et al.

Missing: Problem 1.7 (b)

Missing:

  • Exercises: 2.4
  • Problems: 2.4, 2.9, 2.10, 2.14, 2.15, 2.19

Chapter 3: The Linear Model

Missing:

  • Exercises: 3.12, 3.15
  • Problems: 3.15 (c), 3.18

Chapter 4: Overfitting

Missing:

  • Exercises: 4.9, 4.10
  • Problems: 4.4 (f), 4.21

Missing:

  • Exercises: 6.3
  • Problems: 6.5, 6.9, 6.11 (b), 6.15

Chapter 7: Neural Networks

Missing:

  • Exercises:
  • Problems: 7.2, 7.5, 7.9, 7.12, 7.16

Missing:

  • Exercises:
  • Problems: 8.9 (c)

Chapter 9: Learning Aides

Missing:

  • Exercises: 9.17
  • Problems: 9.11, 9.16, 9.17, 9.26, 9.27, 9.28

Appendix B: Linear Algebra

Appendix C: The E-M Algorithm

Questions

Chapter 1

  • What are the components of a learning algorithm?
  • What are different types of learning problems?
  • Why we are able to learn at all?
  • What are the meanings of various probability quantities?

Chapter 2

  • How do we generalize from training data?
  • How to understand VC dimension? What is the VC dimension of linear models, e.g. perceptron? What do we use it for?
  • How to understand the bounds?
  • How to understand the two competing forces: approximation and generalization?
  • How to understand the bias variance trade off? How to derive it?
  • Why do we need test set? What's the difference between train and test set? What are the advantages of using a test set?
  • What is learning curve? How to intepret it? What are the typical learning curves for linear regression?

Chapter 3

  • What are the linear models? Linear classification, linear regression and logistic regression.
  • What's the application of approximation-generalization in linear models?
  • Why minimize perceptron requires combinatorial efforts while minimize linear regression requires just analytic solution. Logistic regression needs gradient descent.
  • When can GD be used? What algorithms use gradient descent method? What use sub-gradient methods? What are the requirements for GD? What are the advantages of using SGD? What does SGD work at all? What's the convergence speed between GD and SGD?
  • Why fixed rate learning rate GD works?
  • How does feature transformation affect the VC dimension?
  • What are the advantages and disadvantages of feature transformation?
  • What is a projection matrix? Give an example of projection in 1-D or 2-D. Where does it project to?

Chapter 4

  • What is overfitting? What causes overfitting? When does it happen? How do we measure it? Why it's important? What can we do to reduce it? How is overfitting related with in-sample error, out-of-sample error etc. ? How does model complexity, number of data points, and hypothesis set affect overfitting?
  • What are the stochastic noise and deterministic noise? What're the differences betwen them? How do they show up in learning algorithm?
  • What is regularization? How does regularization affect the VC bound? What are the impacts of regularization on generalization error? What are different ways to do regularization? What are the components of regularization? Regularizer, weight decay, etc.
  • What are the Lagrange optimization? Understand it intuitively?
  • What is validation? What do we use it for? What's the relationship between validation and regularization? How are they related to the generalization error?
  • What is validation set? Why do we need re-train with all training data after we run through the validation set? Why do we need validation set?
  • What is cross-validation? When do we use it? What's the procedure to apply cross-validation?

Chapter 5

  • What is the Occam's razor?
  • How do we measure complexity? What are the two recurring theme in complexity measurement in objects?
  • What is the axim of nonfalsifibility?
  • What is sampling bias? What's the implication for VC bounds and Hoeffding bound? What are other biases we see in learning?
  • What is data snooping? How to deal with data snooping?

learning-from-data-a-short-course's People

Contributors

niuers avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.