manjit-baishya-datascience / raisin-variety-prediction

The "Raisin Variety Prediction" project involves data analysis, cleaning outliers, preprocessing data, and training machine learning models for predicting raisin varieties. The best-performing model is chosen based on accuracy scores for deployment.

Home Page: https://www.kaggle.com/code/manjitbaishya001/raisins-variety-prediction-94-accuracy

Jupyter Notebook 100.00%
classification-algorithm data-science machine-learning machine-learning-algorithms

Raisin Variety Prediction

Importing Data

We begin by importing necessary libraries and loading our dataset.
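A minimal sketch of the loading step, assuming the data is available as a CSV file. The column names and the two rows below are made-up stand-ins, not values from the project's dataset, and an in-memory buffer replaces the real file path so the snippet runs on its own.

```python
import io

import pandas as pd

# Stand-in for the real data file: two made-up rows with assumed column names.
# In a notebook, the buffer would be replaced by the path to the dataset file.
csv_text = io.StringIO(
    "Area,MajorAxisLength,MinorAxisLength,Class\n"
    "87524,442.25,253.29,Kecimen\n"
    "75166,406.69,243.03,Besni\n"
)

df = pd.read_csv(csv_text)
print(df.head())
```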

Analysing Data

We inspect the dataset by checking its dimensions, information, and statistical summaries. Visualizations like KDE plots, pairplots, and heatmaps help us understand the data's distribution and correlations.
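The inspection steps described above could look roughly like this. The DataFrame is synthetic and its column names are assumptions; the plots themselves are left out, but the correlation matrix computed at the end is the kind of table a heatmap would display.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; the column names are assumptions.
rng = np.random.default_rng(0)
area = rng.normal(80_000, 10_000, size=100)
df = pd.DataFrame({
    "Area": area,
    "Perimeter": 0.01 * area + rng.normal(0, 20, size=100),
    "Class": rng.choice(["Kecimen", "Besni"], size=100),
})

print(df.shape)        # (rows, columns)
df.info()              # dtypes and non-null counts
print(df.describe())   # summary statistics for the numeric columns

# Pairwise Pearson correlations between the numeric features.
corr = df.select_dtypes("number").corr()
print(corr)
```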

Data Cleaning

We address data cleaning tasks such as handling outliers and replacing null values.

  1. Cleaning Outliers:
    We identify and replace outliers in the dataset to ensure they don't skew our analysis or modeling.

  2. Replacing Outliers with Bounds:
    Outliers are replaced with upper and lower bounds to bring them within an acceptable range.

  3. Checking Boxplots after Replacing Outliers:
    We plot boxplots again after the replacement to confirm that no values remain outside the bounds.
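One common way to replace outliers with bounds is the 1.5 × IQR rule. The sketch below assumes that rule; the source does not say how the bounds are computed, and the helper name `clip_outliers_iqr` is hypothetical.

```python
import pandas as pd


def clip_outliers_iqr(df: pd.DataFrame, k: float = 1.5) -> pd.DataFrame:
    """Return a copy with numeric values clipped to [Q1 - k*IQR, Q3 + k*IQR]."""
    out = df.copy()
    numeric_cols = out.select_dtypes("number").columns
    q1 = out[numeric_cols].quantile(0.25)
    q3 = out[numeric_cols].quantile(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Values below `lower` become `lower`; values above `upper` become `upper`.
    out[numeric_cols] = out[numeric_cols].clip(lower=lower, upper=upper, axis=1)
    return out


# Toy column with one obvious outlier (100.0).
toy = pd.DataFrame({"Area": [10.0, 11.0, 12.0, 13.0, 100.0]})
cleaned = clip_outliers_iqr(toy)
print(cleaned["Area"].tolist())
```

In the toy column, Q1 is 11 and Q3 is 13, so the upper bound is 16 and the value 100 is replaced by 16. Clipping keeps every row, unlike dropping outliers, so the dataset size does not change.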

Data Pre-Processing

We prepare the data for modeling by encoding the categorical class label; the split into training and testing sets follows in the Machine Learning section.

  1. Encoding Class:
    We use label encoding to transform the target variable into numerical format.
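Label encoding can be sketched with scikit-learn's `LabelEncoder`. The class names used here are assumptions for illustration.

```python
from sklearn.preprocessing import LabelEncoder

labels = ["Kecimen", "Besni", "Besni", "Kecimen"]  # assumed class names

encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)

print(list(encoder.classes_))  # classes are stored in sorted order
print(list(encoded))           # integer code for each label
print(list(encoder.inverse_transform(encoded)))  # back to the original names
```

Keeping the fitted encoder around makes it possible to map model predictions back to variety names with `inverse_transform`.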

Machine Learning

We experiment with various machine learning models to identify the best performer for our dataset.

  1. Splitting Data:
    We split the dataset into training and testing sets to facilitate model training and evaluation.

  2. Standardizing Data:
    Standardization rescales each feature to zero mean and unit variance so that all features share a similar scale, which helps scale-sensitive models.

  3. Training Different Models:
    Several machine learning classifiers are trained and compared based on their accuracy scores.
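The three steps above can be sketched end to end as follows. The data is synthetic, and the three classifiers are illustrative choices; the source does not name the models it compares.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data standing in for the raisin features.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (100, 4)), rng.normal(2.0, 1.0, (100, 4))])
y = np.array([0] * 100 + [1] * 100)

# 1. Split, keeping the class proportions equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# 2. Fit the scaler on the training split only, then reuse it on the test
#    split, so no test-set statistics leak into training.
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

# 3. Train several classifiers and record their test accuracy.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
}
scores = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test_s))

# Rank the models by accuracy and pick the top one.
best = max(scores, key=scores.get)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
print("Best model:", best)
```

Accuracy on a single split can vary with the random seed, so cross-validation would give a steadier comparison if the ranking between models is close.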

Results and Conclusion

We summarize the performance of each model and select the best-performing one for predicting raisin varieties.

  • Model Performance Comparison:
    An overview of the accuracy scores of the different machine learning models.

Conclusion

Based on the evaluation metrics, the best-performing classifier is identified and can be used for further analysis or deployment.

THANK YOU
