HDBSCAN

A MATLAB implementation of the Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) algorithm.

The HDBSCAN algorithm builds a nested hierarchy of density-based clusters, discovered non-parametrically from the input data. The hierarchy is akin to that of single-linkage clustering; in HDBSCAN, however, an optimal flat clustering is automatically extracted from the cluster hierarchy. This clustering is analogous to a single run of the DBSCAN algorithm, but with a possibly different epsilon value (see the role of epsilon in DBSCAN) for each branch of the hierarchy. Information from local neighborhoods is thus used to cut the hierarchy at varying levels.
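To make the link to single-linkage clustering concrete, here is a minimal sketch of the first stage described by Campello et al.: distances are converted to mutual reachability distances, and a minimum spanning tree is built over them. It is written in plain Python purely for illustration and is not this repository's code or API. The function names (`core_distances`, `mutual_reachability`, `prim_mst`) are hypothetical, and the core distance is assumed here to be the distance to the `min_pts`-th nearest neighbor excluding the point itself (conventions differ between implementations).

```python
import math

def core_distances(points, min_pts):
    """Distance from each point to its min_pts-th nearest neighbor.
    Assumption: the point itself is not counted as a neighbor."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[min_pts - 1])
    return result

def mutual_reachability(points, min_pts):
    """Dense matrix of max(core(a), core(b), d(a, b)) for every pair a != b."""
    core = core_distances(points, min_pts)
    n = len(points)
    return [
        [0.0 if i == j else max(core[i], core[j], math.dist(points[i], points[j]))
         for j in range(n)]
        for i in range(n)
    ]

def prim_mst(weights):
    """Minimum spanning tree of a dense symmetric weight matrix (Prim's algorithm).
    Returns a list of (parent, child, weight) edges."""
    n = len(weights)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge connecting each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        # Relax the edges leaving the newly added node.
        for v in range(n):
            if not in_tree[v] and weights[u][v] < best[v]:
                best[v] = weights[u][v]
                parent[v] = u
    return edges

# Four points on a unit square plus one distant outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
mrd = mutual_reachability(points, min_pts=2)
edges = prim_mst(mrd)
for u, v, w in edges:
    print(u, v, round(w, 3))
```

Sorting the tree's edges by weight and removing them from heaviest to lightest yields the single-linkage hierarchy in mutual reachability space. In this toy example the three edges inside the square have weight 1, while the single edge to the outlier is much heavier, so the outlier separates from the dense group first.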

This MATLAB implementation of the HDBSCAN algorithm was written with performance in mind and is inspired by the excellent Python version. Although it is not as fast as the Python implementation (which iterates through the hierarchy in highly optimized, compiled C code), it is easy to use, has no dependencies on external toolboxes, and is currently the only MATLAB-based HDBSCAN implementation.

See the docs for interfacing and running HDBSCAN with your own data.

You are free to use/distribute the code, but please keep a reference to this original code base and author (Jordan Sorokin).

Dependencies

  • MATLAB version R2015a or later
  • bfs.m and mst_prim.m, courtesy of David Gleich (included in the repo)

References

  • Campello et al. (2013): Density-Based Clustering Based on Hierarchical Density Estimates.
  • Campello et al. (2015): Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection.

Known Issues

  • Prediction for new points is an approximation, because the cluster hierarchy is not modified when new points are added.
  • Updating the hierarchy with new labels relies on heuristics to handle new clusters that arise from previously labeled outliers.
