Machine-Learning

Machine learning self-study programs.

Machine-learning models

Regression

SONAR Rock vs. Mine prediction with Python: a logistic regression model classifies each input as either Rock or Mine.
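
A minimal sketch of the idea, not the project's actual code: logistic regression trained with plain gradient descent in pure Python. The two-feature readings below are synthetic stand-ins (the real SONAR data is not reproduced here), and the function names, learning rate, and epoch count are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias with stochastic gradient descent on log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            score = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(score) - label  # gradient of log loss w.r.t. score
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, features):
    """Return "Mine" when the predicted probability is at least 0.5."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return "Mine" if sigmoid(score) >= 0.5 else "Rock"


# Synthetic readings (assumption, NOT the real SONAR data): 1 = Mine, 0 = Rock.
X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25], [0.8, 0.9], [0.9, 0.7], [0.85, 0.95]]
y = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic_regression(X, y)
print(predict(weights, bias, [0.1, 0.2]))
print(predict(weights, bias, [0.8, 0.9]))
```

In practice a library implementation would likely replace the hand-written training loop; the sketch only shows what the model computes.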

Clustering

Hierarchical clustering

Hierarchical clustering: clusters have a tree-like structure, or a parent-child relationship.

  • Agglomerative: bottom-up approach. Begin with each element as a separate cluster and merge them into successively larger clusters.
  • Divisive: top-down approach. Begin with the whole set and divide it into successively smaller clusters.
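
The agglomerative (bottom-up) approach can be sketched as follows. This is an illustrative version for one-dimensional points using single linkage (distance between the closest members of two clusters); the linkage choice, the function name, and the sample points are assumptions, not something the notes specify.

```python
def agglomerative(points, target_clusters):
    """Merge the two closest clusters until target_clusters remain."""
    # Start with every point in its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
    return clusters


result = agglomerative([1, 2, 10, 11, 13], target_clusters=2)
print(result)  # [[1, 2], [10, 11, 13]]
```

A divisive version would run in the opposite direction, starting from one cluster that holds every point and splitting it repeatedly.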

Partitional clustering

  • K-Means: Division of objects into clusters such that each object is in exactly one cluster, not several
  • Fuzzy C-Means: Division of objects into clusters such that each object can belong to multiple clusters.
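
A small sketch of K-Means on one-dimensional data, assuming fixed starting centroids so the run is deterministic (the starting values, sample points, and function name are illustrative assumptions). Each point is assigned to exactly one cluster, the one with the nearest centroid.

```python
def k_means(points, centroids, iterations=10):
    """Alternate between hard assignment and centroid update."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Hard assignment: each point joins exactly one cluster.
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


centroids, clusters = k_means([1, 2, 3, 10, 11, 12], centroids=[0, 5])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1, 2, 3], [10, 11, 12]]
```

Fuzzy C-Means would replace the hard assignment step with a membership weight for every point in every cluster.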

Distance Measure

Distance measure: determines the similarity between two elements and influences the shape of the clusters.

  • Euclidean distance measure: the ordinary straight-line distance between two points in Euclidean space.

  • Squared Euclidean distance measure: this metric uses the same equation as the Euclidean distance but does not take the square root.

  • Manhattan distance measure: the sum of the horizontal and vertical components, that is, the distance between two points measured along axes at right angles.

  • Cosine distance measure: based on cosine similarity, which measures the angle between two vectors.

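
The four measures above can be written out in a few lines of Python. This is a sketch for equal-length numeric vectors; defining cosine distance as one minus cosine similarity is a common convention and an assumption here, since the notes do not give a formula.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def squared_euclidean(a, b):
    """Same sum as Euclidean distance, without the square root."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def manhattan(a, b):
    """Sum of absolute differences along each axis."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity (assumed convention); undefined for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


print(euclidean((0, 0), (3, 4)))          # 5.0
print(squared_euclidean((0, 0), (3, 4)))  # 25
print(manhattan((0, 0), (3, 4)))          # 7
print(cosine_distance((1, 0), (0, 1)))    # 1.0
```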

Decision Tree

A decision tree is a tree-shaped diagram used to determine a course of action. Each branch of the tree represents a possible decision, occurrence or reaction.

Problems a decision tree can solve

  • Classification: a classification tree determines a set of logical if-then conditions to classify the data. For example, discriminating between three types of flowers based on certain features.

  • Regression: a regression tree is used when the target variable is numerical or continuous in nature. We fit a regression model to the target variable using each of the independent variables. Each split is chosen based on the sum of squared errors.
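
To make the regression split criterion concrete, here is a sketch that searches for the threshold on a single feature that minimizes the sum of squared errors (SSE) of the two resulting groups. The function names and the toy data are assumptions for illustration; a full regression tree would repeat this search recursively over every feature.

```python
def sse(values):
    """Sum of squared errors around the mean of the group."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return the threshold on xs with the lowest combined SSE of both sides."""
    pairs = sorted(zip(xs, ys))
    best_threshold, best_error = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        error = sse(left) + sse(right)
        if error < best_error:
            best_threshold, best_error = threshold, error
    return best_threshold, best_error


threshold, error = best_split(
    [1, 2, 3, 10, 11, 12], [5.0, 5.0, 5.0, 20.0, 20.0, 20.0]
)
print(threshold, error)  # 6.5 0.0
```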

Advantages of Decision Tree

  • Simple to understand and interpret and visualize

  • Little effort required for data preparation

  • Can handle both numerical and categorical data

  • Nonlinear relationships between parameters do not affect its performance

Disadvantages of Decision Tree

  • Overfitting: occurs when the algorithm captures noise in the data.

  • High variance: the model can become unstable due to small variations in the data.

  • Low bias: a highly complicated decision tree tends to have low bias, so it fits the training data closely and struggles to generalize to new data.

Random Forest

Advantages of Random Forest

  • Less overfitting:

    • Using multiple trees reduces the risk of overfitting
    • Training time is less
  • High accuracy:

    • Runs efficiently on large databases
    • For large data, it produces highly accurate predictions
  • Estimate missing data:

    • Random Forest can maintain accuracy when a large proportion of data is missing

Random Forest, or random decision forest, is a method that operates by constructing multiple decision trees during the training phase. The random forest takes the decision of the majority of the trees as its final decision.
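
The majority-vote step can be sketched as below. The three "trees" are hypothetical hand-written rules standing in for trained decision trees, and the Rock/Mine labels are borrowed from the SONAR example only for illustration; training the trees is left out.

```python
from collections import Counter


def majority_vote(trees, sample):
    """Return the label predicted by the largest number of trees."""
    votes = [tree(sample) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical stand-ins for trained decision trees (assumption).
trees = [
    lambda s: "Mine" if s[0] > 0.5 else "Rock",
    lambda s: "Mine" if s[1] > 0.5 else "Rock",
    lambda s: "Mine" if s[0] + s[1] > 1.2 else "Rock",
]

decision = majority_vote(trees, (0.7, 0.4))
print(decision)  # Rock (votes: Mine, Rock, Rock)
```

Using an odd number of trees avoids ties in a two-class problem.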

Decision Tree

Important Terms

  • Entropy: the measure of randomness or unpredictability in the dataset.
  • Information gain: the measure of the decrease in entropy after the dataset is split.
  • Leaf node: carries the classification or the decision.
  • Decision node: has 2 or more branches.
  • Root node: the topmost decision node.
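
Entropy and information gain can be computed directly from label counts. The sketch below uses the usual base-2 Shannon entropy; the function names and the sample labels are illustrative assumptions, and the child groups are assumed to be non-empty.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(parent, children):
    """Decrease in entropy after splitting parent into the child groups."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["Rock", "Rock", "Mine", "Mine"]
print(entropy(parent))                                                 # 1.0
print(information_gain(parent, [["Rock", "Rock"], ["Mine", "Mine"]]))  # 1.0
```

A split that separates the classes perfectly recovers all of the parent's entropy, which is why the gain here equals the parent entropy.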
