ClodHopper: A High Performance Java Library for Data Clustering

MISSION STATEMENT:

ClodHopper is an open-source Java library for high-performance clustering of numerical data.
It includes implementations of K-Means, K-Means++, X-Means, G-Means, Fuzzy C-Means, and several forms of hierarchical clustering. These implementations use the host system's concurrent processing capability to speed up clustering, and the data structures are kept lean to conserve memory. ClodHopper is also extensible: if you are developing a new clustering algorithm, extending a ClodHopper base class may save you an enormous amount of work.
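
To make the concurrency point concrete, here is a small sketch. It is not ClodHopper code and does not use its API; the class `ParallelAssignment` and its methods are hypothetical names invented for illustration. It shows one common way a clustering step might use multiple cores: the nearest-center lookup for each point is independent of every other point, so a Java parallel stream can spread the work across threads.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

/** Hypothetical illustration only; this class is not part of ClodHopper. */
final class ParallelAssignment {

    /** Squared Euclidean distance between two vectors of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the center nearest to the point (ties go to the lowest index). */
    static int nearestCenter(double[] point, double[][] centers) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double d = squaredDistance(point, centers[c]);
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    /** Assigns every point to its nearest center using a parallel stream. */
    static int[] assign(double[][] points, double[][] centers) {
        // Each lookup reads shared data but writes nothing shared,
        // so the tasks need no locking.
        return IntStream.range(0, points.length)
                        .parallel()
                        .map(i -> nearestCenter(points[i], centers))
                        .toArray();
    }

    public static void main(String[] args) {
        double[][] points = { {0.0, 0.1}, {0.2, 0.0}, {5.0, 5.1}, {4.9, 5.0} };
        double[][] centers = { {0.0, 0.0}, {5.0, 5.0} };
        System.out.println(Arrays.toString(assign(points, centers))); // prints [0, 0, 1, 1]
    }
}
```

Because `toArray()` on an ordered parallel stream preserves encounter order, the value at index i of the result still belongs to point i.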

LICENSING:

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

PRIMARY CONTACT:

Randall Scarberry, email: [email protected]

How to get started with ClodHopper:

  1. If you want to download and browse the code, clone the repository with git:

    git clone https://github.com/rscarberry-wa/clodhopper.git

In the newly created clodhopper directory, you will find the subdirectories clodhopper-core and clodhopper_examples. The first contains the Maven project for ClodHopper itself. The second contains a project with numerous examples. I recommend importing both projects into Eclipse or the IDE of your choice.

  2. If you simply want to use ClodHopper to cluster data in one of your programs, add this dependency to your Maven pom.xml:

    <dependency>
      <groupId>org.battelle</groupId>
      <artifactId>clodhopper-core</artifactId>
      <version>1.0.0</version>
    </dependency>

  3. The simplest example shows how to use k-means to cluster a CSV file containing numeric data. The example is in the file:

    org/battelle/clodhopper/examples/kmeans/SimpleKMeansDemo.java

This file is generously commented.

  4. Also check out the following demos:

    org.battelle.clodhopper.examples.multiple.GeneratedDataPanel

    This example runs several of the clustering algorithms in sequence on generated data. As they complete, it displays scatter plots with the clusters collapsed into two dimensions. You can drag your mouse to select clusters and points in any of the plots, and the selections propagate to the other plots, showing how the clusters correspond.

    org.battelle.clodhopper.examples.ui.ClodHopperUI

    This example lets you read in a CSV data file and cluster the data with many of the library's algorithms, using just about any parameter settings you please. You can then save the clustering results in a simple CSV file.

  5. Watch for more on the wiki! ClodHopper is just getting started.
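
For readers who want to see what a k-means run over CSV-style numeric data involves before opening SimpleKMeansDemo.java, here is a minimal stand-alone sketch. It deliberately does not call the ClodHopper API, and it is not the code from the demo; `TinyKMeans`, `parseCsv`, and `cluster` are hypothetical names. It assumes headerless, comma-separated input with at least k rows, and it seeds the centers with the first k rows, a simplification that K-Means++ style seeding would improve on.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; it does not use the ClodHopper API. */
final class TinyKMeans {

    /** Parses headerless comma-separated numeric rows, one row per line. */
    static double[][] parseCsv(String text) {
        return Arrays.stream(text.split("\\R"))
                     .map(String::trim)
                     .filter(line -> !line.isEmpty())
                     .map(line -> Arrays.stream(line.split(","))
                                        .mapToDouble(s -> Double.parseDouble(s.trim()))
                                        .toArray())
                     .toArray(double[][]::new);
    }

    /**
     * Plain Lloyd's algorithm. Assumes data has at least k rows of equal length.
     * Returns the cluster index assigned to each row.
     */
    static int[] cluster(double[][] data, int k, int maxIterations) {
        int dim = data[0].length;
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = data[c].clone(); // naive seeding: the first k rows
        }
        int[] assignment = new int[data.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: move each row to its nearest center.
            boolean changed = false;
            for (int i = 0; i < data.length; i++) {
                int best = 0;
                double bestDist = Double.POSITIVE_INFINITY;
                for (int c = 0; c < k; c++) {
                    double dist = 0.0;
                    for (int j = 0; j < dim; j++) {
                        double d = data[i][j] - centers[c][j];
                        dist += d * d;
                    }
                    if (dist < bestDist) {
                        bestDist = dist;
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no row switched clusters
            }
            // Update step: each center becomes the mean of its rows.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < data.length; i++) {
                counts[assignment[i]]++;
                for (int j = 0; j < dim; j++) {
                    sums[assignment[i]][j] += data[i][j];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous center
                }
                for (int j = 0; j < dim; j++) {
                    centers[c][j] = sums[c][j] / counts[c];
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        double[][] data = parseCsv("1.0,1.0\n8.0,8.0\n1.2,0.8\n8.2,7.9\n0.9,1.1\n");
        System.out.println(Arrays.toString(cluster(data, 2, 10))); // prints [0, 1, 0, 1, 0]
    }
}
```

The library's own implementations add what this sketch leaves out, such as better seeding and concurrent execution, so treat it only as a reading aid for the demo.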
