
Chateau Efcharpe

This mini-workshop explores some machine learning ideas via the Wine Quality dataset from the University of California Irvine Machine Learning Repository:

https://archive.ics.uci.edu/ml/datasets/Wine+Quality

The goal is to predict the quality of a wine from the 11 chemical measurements available. To do this, we will use ideas from Ensemble Methods, an area of Machine Learning focused on combining weak prediction models into good ones.

How to use this

The workshop is organized into 3 independent scripts, Chapters 1, 2 and 3. The scripts can be run from the F# scripting environment, without having to build anything.

Before diving into the first script, chapter-1.fsx, you will need to install 2 dependencies, FSharp.Data and XPlot. To do this, you can run Paket, or simply run install.cmd or install.sh, depending on your operating system.

Once this is done, go to workshop/chapter-1.fsx, and start following the steps, running the code as you go. Each script contains tutorials, introducing some F# syntax, as well as sections marked [TODO], where you will have to write some code.
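
As a flavor of what working with the data might look like, here is a minimal, hypothetical sketch of loading the red wine file with FSharp.Data. This is not the workshop's code: the dataset URL, the column name, and the use of the NuGet reference directive (the workshop scripts reference the Paket-installed packages instead) are assumptions made for illustration.

```fsharp
// Hypothetical sketch only; the workshop scripts set up their own references.
#r "nuget: FSharp.Data"

open FSharp.Data

// The UCI file is semicolon-separated; URL and column name are assumptions here.
let redWines =
    CsvFile.Load(
        "https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv",
        separators = ";")

// Peek at the first few rows and their quality score.
redWines.Rows
|> Seq.take 5
|> Seq.iter (fun row -> printfn "quality: %s" (row.GetColumn "quality"))
```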

The folder solved/ contains a possible solution for each of the scripts, in case you get stuck.

Once you are done

The core ideas in this workshop are:

  1. using Stumps as prediction models
  2. using a Cost function to compare models
  3. analyzing Residuals to improve a model, and combining "weak" Stumps into a better model
  4. splitting data between training and validation to evaluate the quality of a model
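
For concreteness, here is a minimal F# sketch of what a stump, a cost function, and the residuals-based combination might look like. The names and signatures are illustrative assumptions, not the workshop's own definitions.

```fsharp
// Illustrative sketch only; the workshop scripts define their own versions.
// An observation: one chemical measurement and the wine's quality.
type Observation = { Feature: float; Quality: float }

// A stump: a one-split model predicting one constant below the threshold, another above it.
let stump threshold lowValue highValue (x: float) =
    if x < threshold then lowValue else highValue

// Cost: sum of squared errors between the model's predictions and the actual quality.
let cost (model: float -> float) (sample: Observation seq) =
    sample |> Seq.sumBy (fun obs -> pown (obs.Quality - model obs.Feature) 2)

// Residuals: what the current model fails to explain; a new stump can be fit to these.
let residuals (model: float -> float) (sample: Observation seq) =
    sample |> Seq.map (fun obs -> { obs with Quality = obs.Quality - model obs.Feature })

// Combining weak models: the new prediction adds the residual stump to the current model.
let combine (current: float -> float) (next: float -> float) =
    fun x -> current x + next x
```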

Put together, these ideas give us a fairly general algorithm, which can be seen as a crude version of a more general approach, called Gradient Boosting: https://en.wikipedia.org/wiki/Gradient_boosting

Gradient Boosting takes these ideas further, by

  1. generalizing it to arbitrary models, not just stumps,
  2. generalizing it to arbitrary cost functions, not just the sum of squared errors,

... and putting them in the context of a general method called gradient descent.
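
To picture how these pieces fit together, here is a rough sketch of the generic boosting loop, assuming some learn function that fits a weak model to (input, residual) pairs. The name, signature, and the use of a single float input are assumptions for illustration, not the algorithm as implemented in the workshop.

```fsharp
// Rough sketch of the boosting loop; `learn` is a hypothetical weak learner.
let boost
    (learn: (float * float) [] -> (float -> float))  // fits a weak model to residuals
    (data: (float * float) [])                       // (input, target) pairs
    (iterations: int) : (float -> float) =
    let initial = fun (_: float) -> 0.0
    [ 1 .. iterations ]
    |> List.fold (fun model _ ->
        // fit the next weak model to what the current combined model gets wrong
        let weak =
            data
            |> Array.map (fun (x, y) -> x, y - model x)
            |> learn
        // the new model is the old one plus the correction
        fun x -> model x + weak x) initial
```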

If you are interested in seeing what this might look like, you may find this blog series interesting:

http://brandewinder.com/2016/08/06/gradient-boosting-part-1/
http://brandewinder.com/2016/08/14/gradient-boosting-part-2/
http://brandewinder.com/2016/09/03/gradient-boosting-part-3/
