Java Leader Cluster v1.1

Introduction

Leader Cluster is a simple clustering algorithm described in the book "Clustering Algorithms" by John A. Hartigan (pg. 74, §3.2), published by Wiley. This project is inspired by a similar implementation in R's Leader Cluster package.

In Java Leader Cluster, we have modified the original Leader Cluster algorithm to create clusters of fixed size in a single pass over the data points.

What's New

  • Allows clustering based on road distances, using either the Google Distance API or the OSRM HTTP API
  • Just specify the URL and (optionally) basic auth for OSRM, and/or the Google key, in config/CONFIG.ini

Input

It requires two basic inputs:

  • Data points, each comprising coordinates and a weight
  • Radius of the cluster in meters

Optionally, you can also provide your own distance calculator. By default, it uses the Haversine distance calculator.
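For readers unfamiliar with the default, the Haversine formula gives the great-circle distance between two latitude/longitude pairs. The sketch below is only an illustration of that formula: the class name HaversineSketch, the method distanceMeters and the mean Earth radius of 6,371,000 m are assumptions made for this example, not the project's own calculator or interface.

```java
// Hypothetical helper for illustration; not the project's own distance calculator.
final class HaversineSketch {

    // Mean Earth radius in meters (a common approximation, assumed here).
    static final double EARTH_RADIUS_M = 6_371_000.0;

    // Great-circle distance in meters between two points given in decimal degrees.
    static double distanceMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double sinLat = Math.sin(dLat / 2);
        double sinLon = Math.sin(dLon / 2);
        double a = sinLat * sinLat
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * sinLon * sinLon;
        // Clamp to 1.0 to guard against rounding error before asin.
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }
}
```

A road-distance calculator (OSRM or Google) would replace this straight-line estimate with a distance returned by the routing service.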

Algorithm

The steps of the algorithm are:

  • It first sorts the data points in decreasing order of their weights.
  • The first data point forms its own cluster.
  • For each of the remaining points, it checks whether the point can be inserted into any of the existing clusters. Both of the following conditions must hold:
  • The point's distance from the centroid of the cluster is less than the cluster radius, and
  • The weighted centroid of the cluster after inserting this point does not place any existing member of the cluster outside the cluster radius.
  • Before each iteration, all the existing clusters are sorted in decreasing order of their weights, so that heavier clusters are tried first and grow further.
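The steps above can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the names LeaderClusterSketch, Point, Cluster and tryAdd are hypothetical, weights are assumed to be positive, the centroid is a weighted average of latitude and longitude, a Haversine distance is used, and a point that fits no existing cluster is assumed to start a new one.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the steps listed above; all names are illustrative.
final class LeaderClusterSketch {

    static final class Point {
        final double lat, lon, weight; // weight assumed > 0
        Point(double lat, double lon, double weight) {
            this.lat = lat; this.lon = lon; this.weight = weight;
        }
    }

    static final class Cluster {
        final List<Point> members = new ArrayList<>();
        double lat, lon, weight; // weighted centroid and total weight

        Cluster(Point first) {
            members.add(first);
            lat = first.lat; lon = first.lon; weight = first.weight;
        }

        // Adds p only if both conditions from the list above hold.
        boolean tryAdd(Point p, double radius) {
            // Condition 1: p lies within the radius of the current centroid.
            if (distanceMeters(lat, lon, p.lat, p.lon) >= radius) return false;
            // Condition 2: the updated weighted centroid keeps all existing members in range.
            double w = weight + p.weight;
            double newLat = (lat * weight + p.lat * p.weight) / w;
            double newLon = (lon * weight + p.lon * p.weight) / w;
            for (Point m : members) {
                if (distanceMeters(newLat, newLon, m.lat, m.lon) > radius) return false;
            }
            members.add(p);
            lat = newLat; lon = newLon; weight = w;
            return true;
        }
    }

    static List<Cluster> cluster(List<Point> points, double radius) {
        // Heaviest points are processed first.
        List<Point> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingDouble((Point p) -> p.weight).reversed());

        List<Cluster> clusters = new ArrayList<>();
        for (Point p : sorted) {
            // Re-sort before each iteration so heavier clusters are tried first.
            clusters.sort(Comparator.comparingDouble((Cluster c) -> c.weight).reversed());
            boolean placed = false;
            for (Cluster c : clusters) {
                if (c.tryAdd(p, radius)) { placed = true; break; }
            }
            // Assumption: a point that fits nowhere becomes a new cluster.
            if (!placed) clusters.add(new Cluster(p));
        }
        return clusters;
    }

    // Haversine distance in meters; the Earth radius is an assumed approximation.
    static double distanceMeters(double lat1, double lon1, double lat2, double lon2) {
        double sinLat = Math.sin(Math.toRadians(lat2 - lat1) / 2);
        double sinLon = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        double a = sinLat * sinLat
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * sinLon * sinLon;
        return 2 * 6_371_000.0 * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }
}
```

Because each point is visited once and compared only against the clusters built so far, the clustering is done in a single pass over the sorted points.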

How to build

Requirements:

  • Java 8 (Ubuntu 16.04): sudo apt-get install openjdk-8-jdk
  • Apache Maven: sudo apt-get install maven
  • Set the environment variable JAVA_HOME

After that, clone the project to a folder and build it with the following commands:

cd LeaderCluster
mvn clean package

How to use

There are three ways to use it:

  • You can directly use it as a tool for clustering spatial points by using the spatialClustering package.
  • You can implement the interfaces given in the algorithm package and integrate the Leader Cluster algorithm into your project.
  • You can use it as a standalone runnable jar to cluster points given in an input CSV file.

Using runnable jar

After the build (mvn clean package), a runnable jar is created in the target folder. You can use it as shown below:

java -jar target/JavaLeaderCluster-1.1.jar /path/to/input.csv <radius-of-cluster-in-meters>

Advanced Usage:

  • You can specify the distance calculator (one of haversine, osrm or google) as a third argument:
java -jar target/JavaLeaderCluster-1.1.jar /path/to/input.csv <radius-of-cluster-in-meters> <distance-calculator-name>

Usage

For a sample use case, please look at LeaderClusterTest.java

Using with another java project

Add this project as a submodule or place the jar file in the libs folder, and then include the following plugin in your project's pom.xml:

<plugin>
     <groupId>org.apache.maven.plugins</groupId>
     <artifactId>maven-install-plugin</artifactId>
     <version>2.5.2</version>
     <executions>
         <execution>
             <id>install-external</id>
             <phase>clean</phase>
             <configuration>
                 <file>${basedir}/path/to/JavaLeaderCluster-1.1.jar</file>
                 <repositoryLayout>default</repositoryLayout>
                 <groupId>com.delhivery</groupId>
                 <artifactId>JavaLeaderCluster</artifactId>
                 <version>1.1</version>
                 <packaging>jar</packaging>
                 <generatePom>true</generatePom>
             </configuration>
             <goals>
                 <goal>install-file</goal>
             </goals>
         </execution>
     </executions>
</plugin>
