zakariamejdoul / customer_geolocation_data_clustering


We cluster our customer geolocation data with the KMeans and Constrained-KMeans algorithms, so that the members of each cluster are as close to one another as possible.

License: MIT License

Jupyter Notebook 100.00%
kmeans-clustering constrained-clustering sckit-learn geolocation-data geocoding geopandas

customer_geolocation_data_clustering's Introduction

Build a Clustering Model for Customer Geolocation Data with the K-Means Algorithm

Notes

Clone the project to your machine with:

git clone https://github.com/zakariamejdoul/customer_geolocation_data_clustering.git

Behaviour

What is Clustering?

Clustering is the task of separating a population or a set of data points into groups, such that points in the same group are closer to one another than to points in other groups. In essence, it groups objects by their similarities and differences.

What types of problems can clustering solve?

Clustering algorithms are an effective Machine Learning (ML) technique for unlabeled data (unsupervised learning). K-Means is one of the most popular clustering algorithms, and it is efficient enough to be applied to many ML problems.
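To make the idea concrete, here is a minimal sketch of the classic K-Means loop (Lloyd's algorithm) in plain Python. It is only an illustration, not the code from the notebook, which relies on a library implementation; the function name `kmeans`, its parameters, and the sample points are all hypothetical.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct input points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # Keep the centroid of an empty cluster where it was.
                new_centroids.append(centroids[c])
        if new_labels == labels and new_centroids == centroids:
            break  # nothing changed, so the algorithm has converged
        labels, centroids = new_labels, new_centroids
    return centroids, labels


# Hypothetical sample: two well-separated groups of three points each.
sample = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(sample, k=2)
print(labels)
```

The two steps inside the loop (assign, then update) are the whole algorithm; library implementations mostly add better initialization and faster distance computations.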

K-Means clustering has been applied in many problem areas, such as:

  • Information Technology: used to filter spam, classify network traffic, and identify fraudulent or criminal activity.
  • Marketing: used to discover and characterize customer segments for marketing purposes.
  • Biology: used to classify different species of plants and animals.
  • Insurance: used to understand customers and their policies and to identify fraud.

Clustering for geolocation data

We use our customer geolocation data to build clusters whose members are as close to one another as possible, using KMeans and Constrained KMeans; the latter has a parameter that restricts the number of members in each cluster. We assume each cluster contains the parcels that one driver should deliver, so that the driver only has to travel within a limited, nearby area.
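The effect of a cluster-size limit can be sketched as follows. This is a simple greedy heuristic written for illustration, not the Constrained KMeans algorithm itself and not the notebook's code: given fixed cluster centers, it hands each point to the nearest center that still has room. The names `haversine_km`, `assign_with_capacity`, and `size_max`, as well as the coordinates, are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def assign_with_capacity(points, centers, size_max):
    """Greedily assign each point to a nearby center, at most size_max each."""
    if len(points) > len(centers) * size_max:
        raise ValueError("size_max is too small to place every point")
    # All (distance, point index, center index) triples, closest first.
    pairs = sorted(
        (haversine_km(p, c), i, j)
        for i, p in enumerate(points)
        for j, c in enumerate(centers)
    )
    labels = [None] * len(points)
    counts = [0] * len(centers)
    for _, i, j in pairs:
        # Skip points that are already placed and centers that are full.
        if labels[i] is None and counts[j] < size_max:
            labels[i] = j
            counts[j] += 1
    return labels


# Hypothetical (lat, lon) data: three customers near center 0, one at center 1.
customers = [(0.0, 0.0), (0.0, 0.1), (0.0, 0.2), (0.0, 1.0)]
centers = [(0.0, 0.0), (0.0, 1.0)]
print(assign_with_capacity(customers, centers, size_max=2))  # -> [0, 0, 1, 1]
```

With `size_max=2` the third customer is pushed to the second center even though the first one is closer, which is the trade-off a size constraint introduces: balanced driver workloads at the cost of slightly longer trips.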

The picture below shows the process flow for geolocation data. Since we have our customers' addresses, we need to convert them into latitude and longitude information. This takes a few steps with the GeoPandas API, which are explained in the Geocoding section of the Notebook.

alt image
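As a rough sketch of the address-to-coordinates step, the snippet below uses `geopandas.tools.geocode`. The exact steps in the notebook may differ; the provider choice, the `user_agent` string, and the helper names `points_to_lat_lon` and `geocode_addresses` are assumptions made for illustration. Calling `geocode_addresses` requires geopandas, geopy, and network access.

```python
def points_to_lat_lon(geometries):
    """Turn shapely-style points (x = longitude, y = latitude) into (lat, lon)."""
    coords = []
    for geom in geometries:
        if geom is None or geom.is_empty:
            coords.append(None)  # the address could not be geocoded
        else:
            coords.append((geom.y, geom.x))
    return coords


def geocode_addresses(addresses, provider="nominatim",
                      user_agent="geocoding-sketch"):
    """Geocode address strings into (lat, lon) pairs via GeoPandas."""
    # Imported lazily so the helper above works without geopandas installed.
    import geopandas as gpd

    # Extra keyword arguments are forwarded to the underlying geopy geocoder.
    located = gpd.tools.geocode(
        addresses, provider=provider, user_agent=user_agent
    )
    return points_to_lat_lon(located.geometry)
```

Note that point geometries store longitude in `x` and latitude in `y`, so the order has to be swapped when a (latitude, longitude) pair is wanted.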

Steps

The project notebook is divided into the following parts:

  1. Geocoding: From Address to Longitude & Latitude
  2. Import Geolocation Data
  3. K-Means Model & Training
  4. Clustering with Constrained Problem
  5. Visualization of the Result

Results

The results are shown below:

  • Displaying KMeans Clustering Results: alt map
  • Displaying Constrained KMeans Clustering Results: alt map

Resources

Author

Zakaria Mejdoul



Enjoy Clustering and Visualizing your Customers' Geolocation Data ❗ 🚀
