Code Monkey home page Code Monkey logo

ensamble-learning's Introduction

Open In Colab

Applied Machine Learning Cascading Meta-learner

Table of contents

  • Importing Libaries
  • Data pre-processing
  • Decision Tree Base Learner
  • StackBoost-Cascading

Background

Cascading, according to Google, in simple English literature means "a process whereby something, typically information or knowledge, is successively passed on". Cascading is one of the most powerful ensemble learning algorithm which is used by Machine Learning engineers and scientists when they want to be absolutely dead sure about the accuracy of a result. For example, suppose we want to build a machine learning model which would detect if a credit card transaction is fraudulent or not. If you think about it, it's a binary classification problem where a class label 0 means the transaction is not fraud & a class label 1 means the transaction is fraudulent. In such a model, it's very risky to put our faith completely on just one model. So what we do is build a sequence of models (or a cascade of models) to be absolutely sure about the fact that the transaction is not fraudulent. Cascade models are mostly used when the cost of making a mistake is very very high. I will try to explain cascading with the help of a simple diagram.

1

Use a simple DecisionTreeClassifier as your base classifier. Make each successive model more complex by increasing the maximum_depth of the tree starting from a depth of a single split.

2

Use a confidence threshold of 0.95 and cascading depth of 15.

3

What changes would you make to the algorithm/implementation to improve performance?

Our Approuch - StackBoost Cascading

1. Change from Tree base classifiers to Boosting classifiers (e.g. XGBoost, AdaBoost and GradientBoosting)

2. Tune more hyperparameters when cascading (e.g. max_depth, n_estimators and learning_rate)

3. Evaluate using Cross-Validation Procedure

4. Use Ensemble Voting - Majority Role (e.g. hard or soft voting)

Boosting

The second ensemble technique that we are going to discuss today is called Boosting. Boosting, in general, is used to convert weak learners to strong ones. Weak learners are basically classifier which has a very weak correlation with the true class labels and strong learners are classifiers that have a very high correlation between the model and the true class labels. Boosting involves training the weak learners iteratively, each trying to correct the error made by the previous model. This is achieved by training a weak model on the whole training data, then building a second model which aims at correcting the errors made by the first model. Then we build a third model which will correct the errors made by the second model and so on.

The EnsembleVoteClassifier is a meta-classifier for combining similar or conceptually different machine learning classifiers for classification via majority or plurality voting. (For simplicity, we will refer to both majority and plurality voting as majority voting.)

The EnsembleVoteClassifier implements "hard" and "soft" voting. In hard voting, we predict the final class label as the class label that has been predicted most frequently by the classification models. In soft voting, we predict the class labels by averaging the class-probabilities (only recommended if the classifiers are well-calibrated).

4

Evaluate the predictive performance using the log-loss metric.

⚠️ Prerequisites

📦 How To Install

You can modify or contribute to this project by following the steps below:

1. Clone the repository

  • Open terminal ( Ctrl + Alt + T )

  • Clone to a location on your machine.

# Clone the repository 
$> git clone https://github.com/serfati/ensamble-learning.git  

# Navigate to the directory 
$> cd ensamble-learning

2. Install Dependencies

# install with pip/conda 
$> pip install -r requirments.txt

3. launch of the project

# Run nootebook 
$> jupyter notebook ensemble-learning.ipynb
  • Or open with Colab

    Open In Colab


author Serfati

⚖️ License

This program is free software: you can redistribute it and/or modify it under the terms of the MIT LICENSE as published by the Free Software Foundation.

⬆ back to top

ensamble-learning's People

Contributors

serfati avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.