ZENTAS

A C++ and (optional) Python tool for partitional clustering. Optimised implementations of K-Medoids and K-Means, for various data types. More information is in our paper at arXiv 1609.04723.

K-Medoids a.k.a. K-Centers

Given N elements x(1)...x(N), select K elements indexed by c(1)...c(K) to minimise sum(i=1...N) min(k=1...K) E(distance(x(i), x(c(k)))), where distance is a valid distance and E is a non-decreasing function with E(0) = 0.
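The objective above can be evaluated directly for a given choice of centers. The following is a minimal pure-Python sketch, not the zentas implementation: the names `l2` and `kmedoids_objective` and the toy data are hypothetical, and only the identity and quadratic energies are shown.

```python
import math


def l2(a, b):
    """Euclidean (l-2) distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def kmedoids_objective(x, centers, distance=l2, energy=lambda d: d):
    """Sum over all elements of E(distance to the nearest selected element).

    x       : list of elements
    centers : indices c(1)...c(K) into x
    energy  : non-decreasing function E with E(0) = 0 (identity by default)
    """
    return sum(
        min(energy(distance(xi, x[c])) for c in centers)
        for xi in x
    )


# Toy data: two well-separated pairs of points.
x = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]

# One center per pair, evaluated with identity and quadratic energy.
identity_total = kmedoids_objective(x, centers=[0, 2])
quadratic_total = kmedoids_objective(x, centers=[0, 2], energy=lambda d: d * d)

# Both centers in the same pair: a worse selection with a larger objective.
poor_total = kmedoids_objective(x, centers=[0, 1])
```

Every center is itself an element of the data, so it always contributes E(0) = 0 to the sum. This is what distinguishes K-Medoids from K-Means, where the center need not be a data point.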

Distance options are

  • for sparse and dense vectors : l-0, l-1, l-2, l-infinity
  • for sequence data : Levenshtein and Normalised Levenshtein.
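For reference, the distances named above can be written as follows. This is a pure-Python sketch with hypothetical function names, not the zentas code. It assumes that l-0 counts the coordinates in which two vectors differ and that Levenshtein uses unit costs for insertion, deletion and substitution. Normalised Levenshtein is left out because the normalisation is not specified here.

```python
import math


def l0(a, b):
    """Number of coordinates in which a and b differ (assumed meaning of l-0)."""
    return sum(1 for p, q in zip(a, b) if p != q)


def l1(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


def l2(a, b):
    """Euclidean distance."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def linf(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(p - q) for p, q in zip(a, b))


def levenshtein(s, t):
    """Edit distance with unit insertion, deletion and substitution costs."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]  # distance from s[:i] to the empty prefix of t
        for j, ct in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cs
                curr[j - 1] + 1,           # insert ct
                prev[j - 1] + (cs != ct),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]
```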

Energy E options are

  • identity, quadratic, cubic, square-potential, exponential, and logarithmic.

K-Means for dense and sparse vector data

  • minimise sum of squares of l2 distances to cluster mean
  • minimise sum of l1 distances to cluster dimension-wise median
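The two K-Means variants differ in which center minimises the cost within a cluster: the mean for squared l2 distances, and the dimension-wise median for l1 distances. The sketch below illustrates this for a single cluster. It uses only the Python standard library, and all names and data are hypothetical rather than part of zentas.

```python
from statistics import mean, median


def mean_center(points):
    """Coordinate-wise mean of a cluster."""
    return tuple(mean(col) for col in zip(*points))


def median_center(points):
    """Coordinate-wise (dimension-wise) median of a cluster."""
    return tuple(median(col) for col in zip(*points))


def sum_squared_l2(points, center):
    """Sum of squared l2 distances from each point to the center."""
    return sum(sum((p - q) ** 2 for p, q in zip(pt, center)) for pt in points)


def sum_l1(points, center):
    """Sum of l1 distances from each point to the center."""
    return sum(sum(abs(p - q) for p, q in zip(pt, center)) for pt in points)


# A small cluster with one outlying point.
cluster = [(0.0, 0.0), (1.0, 0.0), (8.0, 3.0)]

c_mean = mean_center(cluster)
c_median = median_center(cluster)

# Each center is the better choice under its own cost.
squared_cost_at_mean = sum_squared_l2(cluster, c_mean)
squared_cost_at_median = sum_squared_l2(cluster, c_median)
l1_cost_at_median = sum_l1(cluster, c_median)
l1_cost_at_mean = sum_l1(cluster, c_mean)
```

The outlying point pulls the mean towards it, while the median stays close to the other two points. This is why the l1 variant is less sensitive to outliers.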

PREREQUISITES

  • CMake
  • for the Python library: Cython and Python

CONFIGURE WITH CMAKE

Create a build directory:

mkdir build; cd build;

If you do NOT want the Python library,

cmake -DBUILD_PYTHON_LIB=NO ..

If you do want the Python library,

cmake ..

BUILD

Build the library from the build directory:

make -j5

The shared library should now be in ./build/zentas (libzentas.so on Linux) and the Python shared library in ./build/python (pyzentas.so on Linux). These can be moved or copied elsewhere manually; there is currently no install option for zentas.

USING

Example use cases of the C++ library and headers are in testsexamples, with the corresponding executables in build/testsexamples. There are examples of clustering dense vectors (exdense.cpp), sparse vectors (exsparse.cpp), and sequences (exwords.cpp).

To use the Python library, make sure the directory containing pyzentas.so is on PYTHONPATH. For example, you can call sys.path.append("/path/to/dir"), where /path/to/dir is the directory that contains pyzentas.so. Examples using pyzentas are in python/examples.py. More information can be obtained from the doc strings; try

import pyzentas
help(pyzentas)
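A slightly more defensive way to set up the import is sketched below. The helper `add_library_dir` is hypothetical, and the relative path build/python assumes the script is run from the repository root after the build step above. Adjust it if pyzentas.so was copied elsewhere.

```python
import importlib
import os
import sys


def add_library_dir(directory):
    """Append a directory to sys.path once, returning its absolute path."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.append(directory)
    return directory


# Assumed location of pyzentas.so, relative to the repository root.
lib_dir = add_library_dir("build/python")

try:
    pyzentas = importlib.import_module("pyzentas")
except ImportError:
    # The library is not built, or lives in a different directory.
    pyzentas = None

if pyzentas is not None:
    help(pyzentas)
```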

Doesn't work, or missing a feature?

Please raise an issue in the zentas repository.

zentas's Issues

[Error] make -j5

zentas/CMakeFiles/zentas.dir/build.make:422: recipe for target 'zentas/CMakeFiles/zentas.dir/src/textfilezentas.cpp.o' failed

make[2]: *** [zentas/CMakeFiles/zentas.dir/src/textfilezentas.cpp.o] Error 1
zentas/CMakeFiles/zentas.dir/build.make:158: recipe for target 'zentas/CMakeFiles/zentas.dir/src/dispatch.cpp.o' failed

make[2]: *** [zentas/CMakeFiles/zentas.dir/src/dispatch.cpp.o] Error 1
CMakeFiles/Makefile2:85: recipe for target 'zentas/CMakeFiles/zentas.dir/all' failed

make[1]: *** [zentas/CMakeFiles/zentas.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

Would you please tell me how to fix it?
