K-Means-without-ML-libraries

image

Dataset Description

The provided dataset consists of 150 samples split across two files, irirs_train.csv and irirs_test.csv, containing 130 and 20 samples, respectively. Because the data is the iris flower dataset, I assumed the following column names:

  • Column 1 - Sepal Length in cm
  • Column 2 - Sepal Width in cm
  • Column 3 - Petal Length in cm
  • Column 4 - Petal Width in cm
  • Column 5 - Species: Iris-Setosa, Iris-Versicolor and Iris-Virginica
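As a rough illustration of the column layout above, the following sketch parses rows into numeric features plus a species label using only the standard library. The names `COLUMNS` and `parse_iris_rows` are hypothetical, and the code assumes the files are comma-separated with no header row, which the description does not confirm.

```python
import csv
import io

# Assumed column order, taken from the dataset description.
# Assumption: the CSV files are comma-separated and have no header row.
COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]


def parse_iris_rows(lines):
    """Parse an iterable of CSV lines into (features, species) pairs."""
    samples = []
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        features = [float(value) for value in row[:4]]  # four measurements in cm
        species = row[4].strip()                        # class label as text
        samples.append((features, species))
    return samples


# Small in-memory demo so the sketch runs without the real files.
demo = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
)
demo_samples = parse_iris_rows(demo)
```

With the real data, the same function could be given an open file object, for example one opened on irirs_train.csv with `newline=""`.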

Findings

  1. Data cleaning and normalization: No missing or null values were found in the input dataset. A min-max scaler was computed to normalize the data before passing it to the algorithm.
  2. Correlation analysis: The analysis produced the following outcomes:
     • Setosa petal lengths and widths are much smaller than those of Versicolor and Virginica.
     • There is a strong linear relationship between all the variables except sepal width, whose relationship is much weaker and negative.
     The correlation table below identifies trends between variables. Depending on the strength of the relationship, it assigns each pair of variables a number between -1 and 1.
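The min-max normalization mentioned in the first finding rescales each feature to the range [0, 1] using x' = (x - min) / (max - min). The sketch below is one way to do this without any ML library; the function names `fit_min_max` and `apply_min_max` are hypothetical, and the choice to map constant columns to 0.0 is an assumption rather than something the repository states.

```python
def fit_min_max(rows):
    """Compute the per-column minimum and maximum from a list of feature rows."""
    columns = list(zip(*rows))  # transpose rows into columns
    mins = [min(column) for column in columns]
    maxs = [max(column) for column in columns]
    return mins, maxs


def apply_min_max(rows, mins, maxs):
    """Scale each value with x' = (x - min) / (max - min)."""
    scaled = []
    for row in rows:
        scaled.append([
            # Assumption: a constant column (max == min) is mapped to 0.0
            # to avoid dividing by zero.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, mins, maxs)
        ])
    return scaled


# Toy data with two features, not taken from the iris files.
train = [[5.0, 3.0], [6.0, 2.0], [7.0, 4.0]]
mins, maxs = fit_min_max(train)
scaled_train = apply_min_max(train, mins, maxs)
```

Fitting the minimum and maximum on the training rows only, then reusing them for the test rows, keeps both files on the same scale. Test values outside the training range would then fall outside [0, 1].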

In the correlation table below, three variables (sepal length, petal length and petal width) have a strong linear relationship with species_id. These variables are therefore likely to be strong predictors of the species of a given sample.
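A coefficient between -1 and 1 of the kind described here can be computed by hand. The sketch below assumes the Pearson correlation coefficient, which the text does not name explicitly, and assumes species_id is an integer encoding of the species label. The function name `pearson` and the toy values are illustrative only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy values, not taken from the iris files.
feature = [1.0, 2.0, 3.0, 4.0]
rising = [2.0, 4.0, 6.0, 8.0]   # moves with the feature
falling = [8.0, 6.0, 4.0, 2.0]  # moves against the feature
r_up = pearson(feature, rising)
r_down = pearson(feature, falling)
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 a weak one. Applying such a function to every pair of columns would give a table like the one pictured.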

image

image
