richasingh-92 / machine-learning-pipelines-with-azure-ml-studio

The aim of this project is to predict whether an adult's income exceeds 50K per year based on census data. The model is created and trained with the Boosted Decision Tree algorithm and is available as an Azure model web service.

Home Page: https://www.coursera.org/account/accomplishments/certificate/4WRD84S37AE4 [certificate link]

azuremachinelearning binaryclassification data-science machinelearning

Machine-Learning-Pipelines-with-Azure-ML-Studio

Project Description: Using the freely available Adult census dataset from the UCI Machine Learning Repository, I built a machine learning model that predicts the income bracket of an individual. Using independent features such as race, sex, age, capital loss, hours-per-week, native country, and others, the model performs a binary classification of each individual into the >50K income pool or the <=50K income pool. I pre-process the open dataset, perform feature engineering, transform the label data type, and train the model with a decision tree classifier; the final model is then deployed as a web service.

Problem Statement: Financial institutions often need to categorize their customers by income, but a few generic features such as age, work sector, and education are not enough to predict it. The Adult census dataset in the UCI ML repository provides 14 features for predicting an individual's income, which supports binary classification of individuals by income. The result is useful for accurate service recommendation. Individuals with income >50K fall into the medium-wage group and can be offered higher-priced products and promotional memberships for clubs, credit cards, etc., while those with income <=50K can be targeted with low-cost, more budget-constrained offers. The categorization can also help in estimating the lifestyle of the individual.

The project is done in 6 steps:

- Introduction
- Dataset cleaning and pre-processing
- Accounting for class imbalance
- Training the ML model: Two-Class Boosted Decision Tree
- Hyperparameter tuning, scoring, and evaluating the models
- Publishing the trained model as a web service

The aim of this project is to build an end-to-end machine learning (ML hereafter) project using the visual interface of Azure Machine Learning Studio, which is a no-code platform.

Dataset discussion:

The dataset contains 32560 rows and 14 independent features. We are trying to predict whether a person earns more than $50,000 a year. Since the label takes only two values (>50K or <=50K), this is a binary classification problem, and we train classification models to predict it.
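As a minimal sketch of how the two-valued label can be turned into a binary target, the snippet below maps the income labels to 0/1 and counts each class. The rows are made-up examples rather than records from the dataset, and the column name `income` and the helper `encode_income` are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical rows; the real dataset has 14 features per record.
rows = [
    {"age": 39, "hours_per_week": 40, "income": "<=50K"},
    {"age": 50, "hours_per_week": 13, "income": "<=50K"},
    {"age": 52, "hours_per_week": 45, "income": ">50K"},
    {"age": 31, "hours_per_week": 50, "income": ">50K"},
    {"age": 23, "hours_per_week": 30, "income": "<=50K"},
]


def encode_income(label: str) -> int:
    """Map the text label to a binary target: 1 for >50K, 0 for <=50K."""
    cleaned = label.strip().rstrip(".")  # tolerate stray spaces or a trailing period
    if cleaned == ">50K":
        return 1
    if cleaned == "<=50K":
        return 0
    raise ValueError(f"unexpected income label: {label!r}")


targets = [encode_income(r["income"]) for r in rows]
class_counts = Counter(targets)  # how many records fall in each class
print(class_counts)
```

Counting the classes right after encoding is a cheap way to see whether the label is imbalanced before any model is trained.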

DATA CLEANING AND ACCOUNTING FOR CLASS IMBALANCE :
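A rough sketch of the two ideas in this step, written in plain Python rather than as Studio modules. It assumes missing values are marked with `?` (as in the UCI Adult files) and uses simple random oversampling as a stand-in for whichever balancing technique the pipeline applies. The sample rows and the helper names `drop_missing` and `oversample_minority` are hypothetical.

```python
import random
from collections import Counter


def drop_missing(rows, marker="?"):
    """Keep only rows in which no field equals the missing-value marker."""
    return [r for r in rows if marker not in r.values()]


def oversample_minority(rows, label_key, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for r in rows:
        by_class.setdefault(r[label_key], []).append(r)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Made-up rows for illustration only.
rows = [
    {"workclass": "Private", "age": 39, "income": "<=50K"},
    {"workclass": "?", "age": 18, "income": "<=50K"},
    {"workclass": "State-gov", "age": 28, "income": "<=50K"},
    {"workclass": "Private", "age": 44, "income": "<=50K"},
    {"workclass": "Self-emp-inc", "age": 52, "income": ">50K"},
]

clean = drop_missing(rows)
balanced = oversample_minority(clean, "income")
print(Counter(r["income"] for r in balanced))
```

In practice, oversampling would be applied to the training split only, so that evaluation still reflects the original class ratio.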

TRAINING THE MODELS :

Training a Two-Class Boosted Decision Tree Model :
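To illustrate the boosting idea behind this step, here is a small pure-Python sketch that adds decision stumps one at a time, each fitted to the residual errors of the ensemble so far (least-squares gradient boosting). It is not the Studio module's implementation; the toy data, the function names, and the settings `n_rounds` and `learning_rate` are assumptions chosen for illustration.

```python
def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimizes squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # a split must put at least one row on each side
            lv = sum(left) / len(left)
            rv = sum(right) / len(right)
            sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    # Returns (feature index, threshold, left value, right value), or None if no split exists.
    return None if best is None else best[1:]


def stump_predict(stump, row):
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv


def fit_boosted(X, y, n_rounds=20, learning_rate=0.5):
    """Start from the mean label and repeatedly add a stump fitted to the current residuals."""
    base = sum(y) / len(y)
    scores = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - si for yi, si in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        scores = [s + learning_rate * stump_predict(stump, row) for s, row in zip(scores, X)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, lr = model
    score = base + lr * sum(stump_predict(s, row) for s in stumps)
    return 1 if score >= 0.5 else 0


# Toy features: [age, hours_per_week]; label 1 means income >50K.
X = [[25, 20], [30, 40], [45, 50], [50, 60], [23, 35], [52, 45], [38, 30], [41, 55]]
y = [0, 0, 1, 1, 0, 1, 0, 1]

model = fit_boosted(X, y)
predictions = [predict(model, row) for row in X]
print(predictions)
```

The number of rounds and the learning rate play the same kind of role as the tree count and learning rate that are typically tuned for a boosted tree model: more rounds with a smaller rate fit more gradually.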

Scoring and Evaluating the Models :
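The metrics commonly used to evaluate a two-class model can be sketched as follows. The label and prediction lists are invented for illustration, and the helper names `confusion_counts` and `evaluate` are hypothetical; they do not come from the project's pipeline.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (true positives, false positives, false negatives, true negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented labels (1 means income >50K) and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = evaluate(y_true, y_pred)
print(metrics)
```

With an imbalanced label, precision and recall on the >50K class are usually more informative than accuracy alone.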

Publishing the Trained Model as a Web Service for Inference :
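Once a model is published, a client typically sends records to it as JSON over HTTPS. The sketch below only builds such a request and does not send it. The URL, the API key, the input name `input1`, the payload layout, and the feature names in `record` are all placeholders and assumptions; the deployed service documents its actual endpoint, key, and input schema.

```python
import json
import urllib.request

# Placeholders: replace with the values shown for the deployed endpoint.
SCORING_URL = "https://example.invalid/score"
API_KEY = "<your-api-key>"


def build_scoring_request(record, url=SCORING_URL, api_key=API_KEY):
    """Build (but do not send) a JSON POST request carrying one record to score."""
    # Assumed payload shape; check the service's own documentation for the real schema.
    payload = {"Inputs": {"input1": [record]}, "GlobalParameters": {}}
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical record with a subset of features.
record = {"age": 39, "education": "Bachelors", "hours-per-week": 40, "sex": "Male"}
request = build_scoring_request(record)
print(request.get_method(), request.full_url)

# To call a real endpoint, pass the request to urllib.request.urlopen()
# and parse the response body with json.loads().
```

Keeping request construction separate from sending makes the payload easy to inspect and test without network access.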
