ensemble_methods_averaging

Predicting students' third-semester math grades (G3) from multiple features.

Feature Selection

Using Pearson correlation matrix for feature selection

The correlation coefficient takes values between -1 and 1:

  • A value closer to 0 implies weaker correlation (exactly 0 implies no linear correlation)
  • A value closer to 1 implies stronger positive correlation
  • A value closer to -1 implies stronger negative correlation
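As a minimal sketch of how these values arise, the coefficient can be computed by hand as the covariance of two variables divided by the product of their standard deviations. The function name `pearson` and the toy data below are illustrative assumptions, not taken from the project's code or dataset.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (variance numerators).
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data, not taken from the student dataset.
hours = [1, 2, 3, 4, 5]
print(pearson(hours, [2, 4, 6, 8, 10]))  # perfectly increasing: 1.0
print(pearson(hours, [10, 8, 6, 4, 2]))  # perfectly decreasing: -1.0
```

In practice a library routine would normally be used instead; the hand-written version only makes the definition explicit.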

More about feature selection is available at link. The dataset has 32 features for predicting students' math grades in the 3rd semester. Using the Pearson correlation matrix, the number of features was reduced to 9, which are as follows:

'sex', 'Medu', 'reason', 'traveltime', 'failures', 'romantic', 'goout', 'G1', 'G2'

The heatmap of all fields is as follows:

heatmap of all fields

Independent variables need to be uncorrelated with each other. If several of them are correlated with each other, keep only one and drop the rest.

Any two features whose correlation exceeds 0.x in absolute value are considered dependent. Of such a pair, only the one with the higher correlation with the target value G3 is kept. The heatmap, together with the features_corr matrix, is the basis for selecting independent features.

After selecting every field with a correlation higher than 20%, the result is as follows:

heatmap of selected features

Making regression models:

Eight different models were used:

  • Linear regression
  • Ridge regression
  • Logistic regression
  • Random forest with 20 trees
  • Random forest with 200 trees
  • Random forest with 120 trees and max depth 3
  • Random forest with 200 trees and max depth 5
  • Gaussian naive Bayes

Finally, the outputs of all models are averaged together.
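The averaging step itself is simple: for each sample, take the mean of the predictions produced by every model. The sketch below assumes each model's predictions are already available as a list (for example, from a fitted model's `predict` call); the function name `average_predictions` and the numbers are hypothetical.

```python
def average_predictions(predictions_per_model):
    """Element-wise mean of several models' predictions for the same samples."""
    n_models = len(predictions_per_model)
    # zip(*...) groups the predictions that belong to the same sample.
    return [sum(sample_preds) / n_models
            for sample_preds in zip(*predictions_per_model)]

# Hypothetical predictions from three models for four students.
preds = [
    [10.0, 12.0, 8.0, 15.0],
    [11.0, 13.0, 7.0, 14.0],
    [12.0, 11.0, 9.0, 16.0],
]
print(average_predictions(preds))  # [11.0, 12.0, 8.0, 15.0]
```

Every model gets equal weight here; a weighted average would be a different design choice and is not what the text describes.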

Conclusion

Averaging the output of all 8 algorithms improved prediction accuracy as follows:

  • Mean Absolute Error: 1.1779733908905832
  • Mean Squared Error: 3.970864130199477
  • Root Mean Squared Error: 1.992702719975932
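For reference, the three error measures can be computed from true and predicted grades as in the sketch below. The function name `regression_errors` and the toy grades are hypothetical and do not reproduce the figures reported above.

```python
import math

def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for paired true and predicted values."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    # RMSE is the square root of MSE, so it is in the same units as the grades.
    return mae, mse, math.sqrt(mse)

# Hypothetical grades, not the project's test set.
mae, mse, rmse = regression_errors([10, 12, 8, 15], [11, 12, 6, 14])
print(mae, mse)  # 1.0 1.5
print(rmse)
```

The reported Root Mean Squared Error is the square root of the reported Mean Squared Error, which is the relationship the last line of the function encodes.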

Contributors

unideverf, erfanebrahimibazaz
