Credit Card Lead Prediction

Description

  • This is a classification problem: identify the customers of a mid-sized bank who are likely to show higher intent towards a recommended credit card.
  • The predictions made by XGBoost and LightGBM are combined using stacking.

Code

Dataset

The dataset train.csv is used for training. It has 245,725 records and 11 columns, including the ID and the target.
The dataset consists of the following attributes:

  • ID : Unique Identifier for a row
  • Gender : Gender of the Customer
  • Age : Age of the Customer (in Years)
  • Region_Code : Code of the Region for the customers
  • Occupation : Occupation Type for the customer
  • Channel_Code : Acquisition Channel Code for the Customer (Encoded)
  • Vintage : Vintage for the Customer (In Months)
  • Credit_Product : Whether the Customer has any active credit product (Home loan, Personal loan, Credit Card etc.)
  • Avg_Account_Balance : Average Account Balance for the Customer in the last 12 Months
  • Is_Active : Whether the Customer was Active in the last 3 Months
  • Is_Lead (Target) : Whether the Customer is interested in the Credit Card (0 : Customer is not interested, 1 : Customer is interested)

Technical Overview

The project has been divided into the following steps :

1. Exploratory Data Analysis

This step covered features with missing values and outliers, the target variable distribution, numerical and categorical feature distributions, and univariate and bivariate analysis.
Some of the data insights are given below. (For the detailed EDA, please refer to the ipynb notebook.)

  • Customers aged 40-60 show greater interest in credit cards, whereas customers in their 20s and 30s are less interested.

  • Salaried customers are less likely to take up credit cards. Entrepreneur is the only occupation in which interested customers are in the majority: 66% of customers in the Entrepreneur category have shown interest in the past, followed by 27.6% of Self Employed, 24.5% of the Other category and 16% of Salaried customers. Only 2 Entrepreneurs do not have any credit product.

  • The number of interested customers who have a credit product is higher than the number of interested customers who do not have one.
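
Per-category interest rates like the ones quoted above can be computed with a grouped mean of the target. The following is a minimal sketch, assuming pandas; the helper name lead_rate_by and the toy rows are hypothetical and are not taken from train.csv or from the notebook.

```python
import pandas as pd

# Toy rows for illustration only; these are NOT values from train.csv.
toy = pd.DataFrame({
    "Occupation": ["Salaried", "Salaried", "Entrepreneur",
                   "Entrepreneur", "Other", "Other"],
    "Is_Lead": [0, 0, 1, 1, 1, 0],
})


def lead_rate_by(df, column, target="Is_Lead"):
    """Share of interested customers (target == 1) within each category."""
    return df.groupby(column)[target].mean().sort_values(ascending=False)


rates = lead_rate_by(toy, "Occupation")
print(rates)
```

The same helper can be pointed at any categorical column, for example Credit_Product or an age bucket, to reproduce the other comparisons.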



2. Data Cleaning

  • Missing values in the Credit_Product column are imputed with the category No_Info.
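
As a minimal sketch of this imputation, assuming the data is held in a pandas DataFrame (the toy frame below is made up for illustration):

```python
import numpy as np
import pandas as pd

# Toy column with gaps; the real data comes from train.csv.
toy = pd.DataFrame({"Credit_Product": ["Yes", np.nan, "No", np.nan]})

# Treat "missing" as its own category instead of guessing Yes or No.
toy["Credit_Product"] = toy["Credit_Product"].fillna("No_Info")
print(toy["Credit_Product"].tolist())
```

Keeping a separate No_Info category lets the model learn whether missingness itself carries signal.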

3. Feature Engineering

  • The categorical features (Gender, Region_Code, Occupation, Channel_Code, Credit_Product, Is_Active) were one-hot encoded.
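
One way to do this encoding is pandas.get_dummies, sketched below. Whether the notebook uses get_dummies or another encoder is an assumption, and the category values in the toy frame are invented for illustration.

```python
import pandas as pd

CATEGORICAL = ["Gender", "Region_Code", "Occupation",
               "Channel_Code", "Credit_Product", "Is_Active"]

# Toy rows; the category values here are illustrative assumptions.
toy = pd.DataFrame({
    "Gender": ["Male", "Female"],
    "Region_Code": ["R1", "R2"],
    "Occupation": ["Other", "Salaried"],
    "Channel_Code": ["C1", "C2"],
    "Credit_Product": ["No", "No_Info"],
    "Is_Active": ["No", "Yes"],
    "Age": [73, 30],
})

# Each categorical column is replaced by one indicator column per category,
# named <column>_<category>; numeric columns such as Age pass through unchanged.
encoded = pd.get_dummies(toy, columns=CATEGORICAL)
print(encoded.columns.tolist())
```

The resulting column names follow the <column>_<category> pattern, for example Is_Active_No.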

4. Oversampling (Handling Class Imbalance in Target Feature)

  • About 76.27% of customers are not interested in the credit card and about 23.72% are interested. To address this imbalance, an oversampling technique (SMOTE) is used.
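
SMOTE creates synthetic minority rows by interpolating between a minority sample and one of its nearest minority neighbours. The NumPy sketch below illustrates that idea only. It is not the implementation used in the project, which is not stated here; in practice the SMOTE class from the imbalanced-learn package is a common choice. The function name smote_like_oversample is hypothetical, the code assumes a binary target, and its pairwise distance matrix makes it suitable for small arrays only.

```python
import numpy as np


def smote_like_oversample(X, y, minority_label=1, k=5, seed=0):
    """Balance a binary target by adding interpolated minority samples."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_min = X[y == minority_label]
    n_new = int((y != minority_label).sum() - len(X_min))
    if n_new <= 0 or len(X_min) < 2:
        return X, y  # already balanced, or nothing to interpolate between

    k = min(k, len(X_min) - 1)
    # Pairwise Euclidean distances between minority points (O(n^2) memory).
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a random base point and one of its k nearest minority neighbours.
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]

    # Place each synthetic point at a random position on the connecting segment.
    gap = rng.random((n_new, 1))
    synthetic = X_min[base] + gap * (X_min[picked] - X_min[base])

    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.full(n_new, minority_label, dtype=y.dtype)])
    return X_out, y_out


# Toy data: four majority rows and two minority rows.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
              [5.0, 5.0], [6.0, 5.0]])
y = np.array([0, 0, 0, 0, 1, 1])
X_res, y_res = smote_like_oversample(X, y)
print(np.bincount(y_res))
```

Oversampling is normally applied to the training portion only, so that validation scores are measured on the original class distribution.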

5. Modelling and Hyperparameter Tuning

  • Both LightGBM and XGBoost are used for modelling.
  • Stacking is used to combine the predictions made by XGBoost and LightGBM.
  • The models are tuned using Randomized Search CV.
  • 5-fold cross-validation was performed to check for overfitting.

Modelling and Evaluation

  • The ROC-AUC score is used as the evaluation metric.
  • The XGBoost model gave a ROC-AUC score of 0.879, while the LightGBM model gave a ROC-AUC score of 0.876.
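
ROC-AUC can be read as the probability that a randomly chosen interested customer receives a higher score than a randomly chosen uninterested one. The sketch below checks that reading against scikit-learn's roc_auc_score on four made-up predictions; pairwise_auc is a hypothetical helper written for this illustration.

```python
from sklearn.metrics import roc_auc_score


def pairwise_auc(y_true, y_score):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities, not project results.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc_score(y_true, y_score)
print(auc)                            # 0.75
print(pairwise_auc(y_true, y_score))  # 0.75
```

Because the metric depends only on the ranking of scores, it is not affected by the 76/24 class split in the way accuracy would be.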

Results

1. Feature Correlations

Age and Vintage have the highest correlation: 0.63 in the train dataset and 0.62 in the test dataset.
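
A pairwise correlation like this can be obtained with pandas. The sketch below assumes the Pearson coefficient, which is the pandas default; the method used in the notebook is not stated here, and the toy values are invented.

```python
import pandas as pd

# Toy values for illustration; not rows from train.csv.
toy = pd.DataFrame({
    "Age": [25, 32, 41, 50, 63],
    "Vintage": [14, 20, 39, 33, 80],
})

# Pearson correlation between two columns.
r = toy["Age"].corr(toy["Vintage"])

# Full correlation matrix across all numeric columns.
corr_matrix = toy.corr()
print(round(r, 2))
print(corr_matrix)
```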

2. Feature Importance

XGBoost

In the XGBoost model, the top 5 features by importance are: Avg_Account_Balance, Vintage, Age, Is_Active_No and Credit_Product_Yes.

LightGBM

In the LightGBM model, the top 5 features by importance are: Vintage, Age, Occupation_Other, Avg_Account_Balance and Occupation_Self_Employed.

Authors
