catboost_insurance_churn_rate

This repo complements my other Git repository, in which I trained XGBoost and LightGBM models to predict the churn rate of sample insurance clients. Here the same data set is used to train a CatBoost model.

Objectives

  1. Comparing CatBoost with XGBoost and LightGBM.
  2. Building soft and hard voters manually, saving the trained models to disk, and applying an ensemble of the trained models to the test data.
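
The difference between the two voters in objective 2 can be sketched in plain Python. This is a minimal illustration, not the repo's code: the function names `soft_vote` and `hard_vote`, the 0.5 threshold, and the probabilities are assumptions made up for the example. A soft voter averages the models' predicted probabilities and thresholds the mean; a hard voter thresholds each model first and then takes the majority label.

```python
def soft_vote(probabilities, threshold=0.5):
    """Average the class-1 probabilities across models, then threshold the mean.

    `probabilities` holds one list per model, each with one value per sample.
    """
    n_models = len(probabilities)
    labels = []
    for sample_probs in zip(*probabilities):
        mean_p = sum(sample_probs) / n_models
        labels.append(1 if mean_p > threshold else 0)
    return labels


def hard_vote(probabilities, threshold=0.5):
    """Threshold each model's probability first, then take the majority label.

    Ties (possible with an even number of models) resolve to 0 here.
    """
    n_models = len(probabilities)
    labels = []
    for sample_probs in zip(*probabilities):
        votes = sum(1 for p in sample_probs if p > threshold)
        labels.append(1 if votes / n_models > 0.5 else 0)
    return labels


# Made-up class-1 probabilities from three hypothetical models on three samples.
model_probs = [
    [0.90, 0.45, 0.20],
    [0.40, 0.45, 0.30],
    [0.40, 0.95, 0.60],
]

soft_labels = soft_vote(model_probs)
hard_labels = hard_vote(model_probs)
print("soft:", soft_labels)
print("hard:", hard_labels)
```

On this toy input the two voters disagree on the first two samples: one very confident model pulls the averaged probability above 0.5, while the majority of thresholded labels is still 0.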

Catboost

  1. CatBoost is used together with cross-validation.
  2. scikit-learn's train_test_split is used to split the data into 70% train, 15% validation and 15% test.
  3. A CatBoost model is trained for each fold of the k-fold cross-validation.
  4. f1_score is used to evaluate the model in each fold.
  5. Feature importance is calculated across all folds together.
  6. The features are sorted by their importance for the target value and visualized with seaborn.
  7. The model trained in each fold is saved separately with pickle.
  8. All saved models are loaded back and used to predict the target value of the test data set.
  9. The models' predictions are averaged, then rounded to 0 if the average is less than or equal to 0.5 and to 1 otherwise.
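
Steps 7 to 9 can be sketched as follows. This is a self-contained illustration under stated assumptions rather than the repo's actual code: `StandInModel` is a hypothetical placeholder for a fitted CatBoost model (any picklable object with a `predict` method would work the same way), and the file names, cutoffs and test rows are made up.

```python
import pickle
import tempfile
from pathlib import Path


class StandInModel:
    """Hypothetical placeholder for the fitted model of one fold."""

    def __init__(self, cutoff):
        self.cutoff = cutoff

    def predict(self, rows):
        # Predict 1 when the single feature exceeds this fold's cutoff.
        return [1 if row[0] > self.cutoff else 0 for row in rows]


def save_fold_models(models, directory):
    """Step 7: pickle each fold's model into its own file."""
    paths = []
    for fold, model in enumerate(models):
        path = Path(directory) / f"model_fold_{fold}.pkl"
        with open(path, "wb") as fh:
            pickle.dump(model, fh)
        paths.append(path)
    return paths


def load_fold_models(paths):
    """Step 8: load every saved model back from disk."""
    models = []
    for path in paths:
        with open(path, "rb") as fh:
            models.append(pickle.load(fh))
    return models


def ensemble_predict(models, rows):
    """Step 9: average the models' predictions, then round.

    An average less than or equal to 0.5 becomes 0, anything above becomes 1.
    """
    per_model = [model.predict(rows) for model in models]
    averaged = [sum(votes) / len(models) for votes in zip(*per_model)]
    return [0 if avg <= 0.5 else 1 for avg in averaged]


# Five stand-in models, one per fold, with made-up cutoffs.
fold_models = [StandInModel(c) for c in (0.2, 0.4, 0.5, 0.6, 0.8)]
test_rows = [[0.1], [0.45], [0.55], [0.9]]

with tempfile.TemporaryDirectory() as tmp_dir:
    saved_paths = save_fold_models(fold_models, tmp_dir)
    restored_models = load_fold_models(saved_paths)
    ensemble_labels = ensemble_predict(restored_models, test_rows)

print(ensemble_labels)
```

Keeping one file per fold lets the models be reloaded and combined later without retraining. Since unpickling can execute arbitrary code, only load pickle files from a source you trust.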

Results

  1. A single CatBoost model performed more accurately than the ensemble of 5 CatBoost models, each trained on a different fold.
  2. Training a CatBoost model on all of the training data, with no split, improved performance when tested on the df_test data frame. The problem with this method is that, with no validation data held out, there is no way to be sure the final model is not overfit.

Impact of different encodings on catboost model

CatBoost can handle categorical data directly and does not require encoding. Still, we check the impact of different encodings on the CatBoost model.

Contributors

unideverf, erfanebrahimibazaz
