optics-automatic-clustering's Introduction

Automatic Clustering of Hierarchical Clustering Representations
 
Library dependencies: numpy, plus matplotlib if graphing is desired.
The OPTICS implementation used here depends on numpy, scipy, and hcluster.

An implementation of the following algorithm, with some minor add-ons:
J. Sander, X. Qin, Z. Lu, N. Niu, A. Kovarsky, K. Whang, J. Jeon, K. Shim, J. Srivastava, Automatic extraction of clusters from hierarchical clustering representations. Advances in Knowledge Discovery and Data Mining (2003) Springer Berlin / Heidelberg. 567-567
available from http://dx.doi.org/10.1007/3-540-36175-8_8

Implemented in Python by Amy X. Zhang, while at Cambridge Computer Laboratory in March 2012. Now at MIT.
http://people.csail.mit.edu/axz

This algorithm assumes you have already obtained a hierarchical clustering representation of your data, through OPTICS or other means. The clustering algorithm I use is OPTICS; a Python implementation can be found here: http://chemometria.us.edu.pl/download/optics.py. See the demo for how to use the OPTICS code with this algorithm.

OPTICS is a hierarchical clustering algorithm for finding density-based clusters in spatial data. It returns a list of points linearly ordered so that spatially close points are neighbors, along with an associated reachability value for every point. Together these make up a reachability plot, a special kind of dendrogram. For this algorithm, the input is a reachability plot. If necessary, dendrograms can be converted to reachability plots and vice versa, as outlined in the above paper.
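
To make the input format concrete, here is a minimal sketch of how an ordering and its reachability values can be produced. It is not the optics.py implementation referenced above: it is a simplified O(n^2) OPTICS with an unbounded neighborhood radius, and the names simple_optics and min_pts are hypothetical, chosen only for illustration.

```python
import math

def simple_optics(points, min_pts=2):
    """Simplified OPTICS with an infinite neighborhood radius (illustrative only).

    Returns (order, reachability): order lists point indices in visiting order,
    and reachability[k] is the reachability value of point order[k].
    Assumes len(points) >= 2 and 1 <= min_pts <= len(points) - 1.
    """
    n = len(points)

    def dist(a, b):
        return math.dist(points[a], points[b])

    # Core distance: distance to the min_pts-th nearest other point.
    # (Definitions differ on whether the point itself is counted.)
    core = []
    for i in range(n):
        neighbors = sorted(dist(i, j) for j in range(n) if j != i)
        core.append(neighbors[min_pts - 1])

    reach = [math.inf] * n
    processed = [False] * n
    order = []
    current = 0  # arbitrary starting point
    while True:
        processed[current] = True
        order.append(current)
        remaining = [j for j in range(n) if not processed[j]]
        if not remaining:
            break
        # Update reachability of every unprocessed point from the current one.
        for j in remaining:
            reach[j] = min(reach[j], max(core[current], dist(current, j)))
        # Visit the unprocessed point that is currently easiest to reach.
        current = min(remaining, key=lambda j: reach[j])
    return order, [reach[i] for i in order]

# Two well-separated groups of three points each (made-up data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
order, reachability = simple_optics(points, min_pts=2)
for idx, r in zip(order, reachability):
    print(idx, points[idx], f"{r:.2f}")
```

The first point has no predecessor, so its reachability is infinite. The single large value in the middle of the plot marks the jump from one group to the other, while the low values on either side form the valleys that correspond to clusters.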
 
The algorithm takes in a reachability plot and the list of points ordered by OPTICS. It returns the root node of the hierarchical clustering tree, which is built by finding the local maxima in the reachability plot and deciding whether to use or discard each one. Please see the paper for a more detailed description. There are a few changes from the algorithm given in the paper, and they are noted in the code. For functionality, I have added helper functions to print out, graph, write to file, or extract only the leaves from the tree. See the demo for examples of how to use them.
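
The idea of splitting at local maxima can be sketched as follows. This is a simplified illustration and not the code in this repository: the names find_local_maxima, Node, build_tree, and leaves are hypothetical, the 0.75 significance ratio and the minimum cluster size are illustrative assumptions, and the sketch assumes only the first reachability value is infinite.

```python
import math

def find_local_maxima(reach, neighborhood=1):
    """Indices of interior local maxima, highest reachability first."""
    n = len(reach)
    maxima = []
    for i in range(1, n - 1):
        window = reach[max(0, i - neighborhood):min(n, i + neighborhood + 1)]
        if reach[i] == max(window) and reach[i] > min(window):
            maxima.append(i)
    maxima.sort(key=lambda i: reach[i], reverse=True)
    return maxima

class Node:
    """A cluster covering positions [start, end) of the OPTICS ordering."""
    def __init__(self, start, end):
        self.start, self.end = start, end
        self.children = []

def _mean(values):
    return sum(values) / len(values)

def build_tree(reach, start, end, maxima, min_size=2, ratio=0.75):
    """Split [start, end) at the highest usable local maximum, recursively.

    min_size must be at least 2 so both averages below are over non-empty slices.
    """
    node = Node(start, end)
    for split in maxima:
        if not (start < split < end):
            continue  # this maximum lies outside the current region
        if split - start < min_size or end - split < min_size:
            continue  # discard: one side would be too small
        # Skip the first value of each side: it is the cost of entering that
        # side (a split value or the infinite first point), not part of it.
        left_avg = _mean(reach[start + 1:split])
        right_avg = _mean(reach[split + 1:end])
        if max(left_avg, right_avg) / reach[split] >= ratio:
            continue  # discard: the peak is not much higher than its surroundings
        node.children = [
            build_tree(reach, start, split, maxima, min_size, ratio),
            build_tree(reach, split, end, maxima, min_size, ratio),
        ]
        break
    return node

def leaves(node):
    """Return the (start, end) ranges of all leaf clusters, left to right."""
    if not node.children:
        return [(node.start, node.end)]
    return [leaf for child in node.children for leaf in leaves(child)]

# Made-up reachability values: two valleys separated by one high peak.
reach = [math.inf, 0.1, 0.12, 0.1, 0.9, 0.1, 0.11, 0.1]
maxima = find_local_maxima(reach)
root = build_tree(reach, 0, len(reach), maxima)
print(maxima)
print(leaves(root))
```

In this toy input the peak at position 4 is used as a split point, while the small bumps at positions 2 and 6 are discarded because they are barely higher than the values around them.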

This code was used for the following publication (BibTeX format):

@inproceedings{zhang2013hoodsquare,
  title={Hoodsquare: Modeling and recommending neighborhoods in location-based social networks},
  author={Zhang, Amy X and Noulas, Anastasios and Scellato, Salvatore and Mascolo, Cecilia},
  booktitle={Social Computing (SocialCom), 2013 International Conference on},
  pages={69--74},
  year={2013},
  organization={IEEE}
}
