CPFcluster

An implementation of the Component-wise Peak-Finding (CPF) clustering method, presented in 'A Theoretical Analysis of Density Peaks Clustering and the CPF Algorithm'.

Dependencies

CPFcluster requires Python 3 with numpy, scipy and scikit-learn; itertools and multiprocessing are also used, but both are part of the Python standard library. numpy and scipy should be linked with a BLAS implementation (e.g., OpenBLAS, ATLAS, Intel MKL).

Installation

CPFcluster is available on PyPI, the Python Package Index.

$ pip install CPFcluster

How To Use

To use CPFcluster, first import the CPFcluster module.

    from CPFcluster import CPFcluster

Clustering a Dataset

A CPFcluster object is constructed with the parameters listed below. Its fit method then computes a clustering of a dataset.

    CPF = CPFcluster(k, rho, alpha, n_jobs, remove_duplicates, cutoff)
    CPF.fit(X)

CPFcluster takes 6 arguments:

  • k: Number of nearest neighbors used to create connected components from the dataset and to compute the density.
  • rho: (Defaults to 0.4) Parameter used in the threshold for center selection.
  • alpha: (Defaults to 1) Optional parameter used in the threshold on edge weights for center selection; not discussed in the paper.
  • n_jobs: (Defaults to 1) Number of cores for the program to run on.
  • remove_duplicates: (Defaults to False) Option to remove duplicate rows from the data before clustering.
  • cutoff: (Defaults to 1) Threshold for removing instances as outliers. Instances with fewer edges than the cutoff value are removed.
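To make the roles of k and cutoff more concrete, here is a minimal pure-Python sketch of how a k-NN graph can be split into connected components while low-degree points are set aside as outliers. It is not the CPFcluster implementation: the function name mutual_knn_components, the use of a mutual k-NN graph, the Euclidean distance and the brute-force neighbor search are all assumptions made for illustration.

```python
import math
from collections import deque

def mutual_knn_components(points, k, cutoff=1):
    """Illustrative only: label connected components of a mutual k-NN graph."""
    n = len(points)
    # k nearest neighbours of every point, by brute-force Euclidean distance.
    knn = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        knn.append(set(order[:k]))
    # Assumption: keep an edge only if each endpoint lists the other as a neighbour.
    adj = [{j for j in knn[i] if i in knn[j]} for i in range(n)]
    # Points with fewer than `cutoff` edges are treated as outliers.
    kept = [len(adj[i]) >= cutoff for i in range(n)]
    components = [math.nan] * n  # NaN marks an outlier
    label = 0
    for start in range(n):
        if not kept[start] or not math.isnan(components[start]):
            continue
        # Breadth-first search over the kept points reachable from `start`.
        components[start] = label
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if kept[v] and math.isnan(components[v]):
                    components[v] = label
                    queue.append(v)
        label += 1
    return components

# Two tight groups and one far-away point (made-up data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
components = mutual_knn_components(points, k=2, cutoff=1)
```

In this toy dataset the far-away point has no mutual neighbors, so with cutoff=1 it is marked NaN, while the two groups form separate components.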

The CPFcluster object is then fit to a dataset:

  • X: An n-by-d numpy.ndarray with training data. The rows correspond to n observations, and the columns correspond to d dimensions.

After fitting, the CPFcluster object also contains:

  • CCmat: An n-by-n sparse matrix representation of the k-NN graph.
  • components: A vector containing the index of the component to which each instance belongs. If the instance is an outlying point, the value is NaN.
  • labels_: The final cluster labels.
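Because outliers are marked with NaN in components, they cannot be found with an equality test (NaN is not equal to itself). The sketch below shows one way to separate outliers from component members. The helper summarize_components and the example vector are hypothetical stand-ins, not part of the CPFcluster API.

```python
import math
from collections import Counter

def summarize_components(components):
    """Return (indices of outliers, {component index: number of instances})."""
    # math.isnan is needed here: `c == math.nan` is always False.
    outliers = [i for i, c in enumerate(components) if math.isnan(c)]
    sizes = Counter(int(c) for c in components if not math.isnan(c))
    return outliers, dict(sizes)

# Made-up values standing in for the components attribute of a fitted object.
example_components = [0.0, 0.0, 0.0, 1.0, 1.0, math.nan, 1.0]
outliers, sizes = summarize_components(example_components)
```

The same function should also work on a one-dimensional numpy array of floats, since math.isnan accepts numpy float scalars.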

Experimentation

To replicate the experiments in the original paper:

$ python3 run_CPF.py

License

CPFcluster has an MIT License.

See LICENSE.
