Generic implementations of clustering algorithms.
huonw / cogset
Home Page: http://huonw.github.io/cogset/cogset
License: Apache License 2.0
The current design chooses the first k points as starting values.
If any of these starting points are identical, the first of the duplicates is assigned all the points and the second is assigned none. The empty cluster then produces a NaN mean over its 0 members, which derails the whole clustering algorithm.
There are 2 solutions I can think of to avoid this condition:
The first one seems simple and more predictably performant to start from.
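The failure mode, and one possible guard against it, can be sketched as follows. This is a standalone illustration, not cogset's code and not necessarily either of the solutions referred to above: `naive_mean` and `distinct_seeds` are hypothetical names, and points are assumed to be plain `Vec<f64>` values.

```rust
/// Mean of a cluster's members along one coordinate.
/// For an empty cluster this divides zero by zero and yields NaN,
/// which is the failure described above.
fn naive_mean(members: &[f64]) -> f64 {
    members.iter().sum::<f64>() / members.len() as f64
}

/// Hypothetical guard: take the first `k` pairwise-distinct points as
/// starting values instead of the first `k` points.
/// Returns `None` if the data holds fewer than `k` distinct points.
fn distinct_seeds(data: &[Vec<f64>], k: usize) -> Option<Vec<Vec<f64>>> {
    let mut seeds: Vec<Vec<f64>> = Vec::with_capacity(k);
    for p in data {
        if seeds.len() == k {
            break;
        }
        // Skip any point that exactly equals an already chosen seed.
        if !seeds.iter().any(|s| s == p) {
            seeds.push(p.clone());
        }
    }
    if seeds.len() == k {
        Some(seeds)
    } else {
        None
    }
}
```

The scan is linear in the data for each seed chosen, so it keeps the cost of initialisation predictable; it does not protect against clusters that become empty in later iterations.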
I would like to be able to use arbitrary-length Vec<f64>s as points in the clustering algorithms. I assume the current restriction exists because compile-time checks on the length of the vectors are desirable, to ensure that you don't accidentally add a dimension somewhere.
My use-case is computing a series of audio spectrum coefficients, and because of this limitation changing the number of bins (dimensions) means that I have to recompile the code. I would like to allow the end-user to change the number of bins, but I'm not sure how to do that without arbitrary-length vectors. I'm not particularly concerned about efficiency of working on the stack.
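A rough sketch of what a runtime-checked alternative could look like is below. It is an assumption-laden illustration rather than cogset's API: `DynPoint`, `DimMismatch` and `to_points` are hypothetical names, and the dimension check that fixed-size arrays give at compile time is replaced by a single validation pass when the data is loaded.

```rust
/// Hypothetical point type whose dimension is only known at run time.
#[derive(Debug, Clone, PartialEq)]
struct DynPoint(Vec<f64>);

/// Error returned when a row does not match the dimension of the first row.
#[derive(Debug, PartialEq)]
struct DimMismatch {
    expected: usize,
    found: usize,
}

/// Check once, up front, that every row has the same number of bins.
fn to_points(rows: Vec<Vec<f64>>) -> Result<Vec<DynPoint>, DimMismatch> {
    let expected = rows.first().map_or(0, |r| r.len());
    rows.into_iter()
        .map(|r| {
            if r.len() == expected {
                Ok(DynPoint(r))
            } else {
                Err(DimMismatch { expected, found: r.len() })
            }
        })
        .collect()
}

impl DynPoint {
    /// Euclidean distance; dimensions are assumed to have been validated
    /// by `to_points`, so the check here only runs in debug builds.
    fn dist(&self, other: &DynPoint) -> f64 {
        debug_assert_eq!(self.0.len(), other.0.len());
        self.0
            .iter()
            .zip(&other.0)
            .map(|(a, b)| (a - b).powi(2))
            .sum::<f64>()
            .sqrt()
    }
}
```

With this shape the number of bins can come from user input, at the cost of heap allocation per point and of moving the dimension check from compile time to load time.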
Thought I'd open an issue since this hasn't already been opened and closed, but I understand that it's an enhancement not a bug.
Could you please provide some performance benchmarks against scikit-learn for training and inference of the clustering algorithms you implemented? If you have the time, also against the Intel DAAL library?
I am looking for a starting point to implement fast clustering algorithms and want to know whether switching to Rust would bring significant gains compared to scikit-learn in Python (which uses NumPy, which can use an MKL backend) or compared to scikit-learn in the Intel Distribution for Python, which uses the aforementioned DAAL library.
Thanks!
I use a custom implementation of hierarchical bottom-up clustering with complete linkage in a private project and I would like to move it to a public external crate for obvious reasons, i.e., help others, get help, thin my code base and focus on my main idea.
Well, I found your crate and I was wondering if it is the right place. It should be fairly simple to incorporate even more hierarchical linkage criteria.
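For readers unfamiliar with the technique, bottom-up clustering with complete linkage can be sketched as below. This is a naive cubic-time illustration written for this page, not the private implementation mentioned above and not part of cogset; `complete_linkage` and `agglomerate` are hypothetical names, and points are referred to by index with a caller-supplied distance function.

```rust
/// Complete-linkage distance: the largest pairwise distance between
/// any member of `a` and any member of `b`.
fn complete_linkage(a: &[usize], b: &[usize], dist: &dyn Fn(usize, usize) -> f64) -> f64 {
    let mut worst = 0.0_f64;
    for &i in a {
        for &j in b {
            worst = worst.max(dist(i, j));
        }
    }
    worst
}

/// Start with one cluster per point and repeatedly merge the two clusters
/// with the smallest complete-linkage distance until `target` remain.
fn agglomerate(n: usize, target: usize, dist: &dyn Fn(usize, usize) -> f64) -> Vec<Vec<usize>> {
    let mut clusters: Vec<Vec<usize>> = (0..n).map(|i| vec![i]).collect();
    while clusters.len() > target.max(1) {
        // (index of first cluster, index of second cluster, linkage distance)
        let mut best = (0, 1, f64::INFINITY);
        for i in 0..clusters.len() {
            for j in (i + 1)..clusters.len() {
                let d = complete_linkage(&clusters[i], &clusters[j], dist);
                if d < best.2 {
                    best = (i, j, d);
                }
            }
        }
        // best.1 > best.0, so removing best.1 leaves index best.0 valid.
        let merged = clusters.remove(best.1);
        clusters[best.0].extend(merged);
    }
    clusters
}
```

Other linkage criteria would only change the body of `complete_linkage` (for example taking the minimum for single linkage), which is why adding them should be a small step.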
cc @huonw (added by @huonw to give me an email about this issue)