Original paper
This package performs unsupervised learning of Gaussian mixture models, using a minimum message length (MML)-like criterion to select the optimal number of components in a finite Gaussian mixture model.
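For context, the criterion minimized in the original paper (Figueiredo & Jain, 2002, "Unsupervised learning of finite mixture models", which this package appears to port) trades the data log-likelihood against a code-length penalty on the active components. A sketch of that criterion in LaTeX, with the notation defined in the comments:

```latex
% Sketch of the MML-like cost (notation as assumed here):
%   n    = number of data points
%   N    = parameters per component (d + d(d+1)/2 for a
%          full-covariance Gaussian in d dimensions)
%   k_nz = number of components with nonzero weight \alpha_m
\mathcal{L}(\theta, Y) =
    \frac{N}{2} \sum_{m:\, \alpha_m > 0} \log\frac{n\,\alpha_m}{12}
    + \frac{k_{nz}}{2} \log\frac{n}{12}
    + \frac{k_{nz}(N + 1)}{2}
    - \log p(Y \mid \theta)
```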
To install this Python package:

```
pip install gmm-mml
```
This implementation is a port of the original authors' MATLAB code, with small modifications, and it is built as a scikit-learn wrapper. The dependencies are:
- numpy
- scipy
- sklearn
To run the example scripts it is also advisable to install matplotlib.
It should be compatible with both Python 2 and Python 3.
The following points were generated using three bivariate Gaussian distributions.
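A minimal sketch of how such data might be generated (the means and covariances below are illustrative assumptions, not the exact ones behind the figure):

```python
import numpy as np

np.random.seed(0)
# Stack 300 samples from each of three bivariate Gaussians into one array X.
X = np.vstack([
    np.random.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 1.0]], 300),
    np.random.multivariate_normal([5, 5], [[1.0, -0.4], [-0.4, 1.5]], 300),
    np.random.multivariate_normal([0, 6], [[0.5, 0.0], [0.0, 0.5]], 300),
])
```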
The clustering algorithm correctly converges to those distributions:

```python
from gmm_mml import GmmMml

unsupervised = GmmMml(plots=True)
unsupervised.fit(X)
```
It is also possible to visualize this process with `GmmMml(plots=True, live_2d_plot=False)`.
Available sklearn methods:
- .fit() - fit the finite mixture model
- .fit_transform() - fit and return the inputs' posterior probabilities
- .transform() - return the inputs' posterior probabilities
- .predict() - return the inputs' cluster labels
- .predict_proba() - same as .transform()
- .sample() - sample new data from the fitted mixture model
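A minimal usage sketch tying these methods together (assuming X is a 2-D array of samples such as the one generated above; the n_samples argument to .sample() is an assumption following the scikit-learn convention, not confirmed from this package's API):

```python
from gmm_mml import GmmMml

model = GmmMml()                     # plots disabled for this sketch
model.fit(X)                         # learn the mixture and its number of components

labels = model.predict(X)            # hard cluster assignment per input point
posteriors = model.predict_proba(X)  # per-component posterior probabilities
new_points = model.sample(10)        # draw 10 new points (signature assumed sklearn-like)
```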
The folders ./example_scipts and ./tutorials contain examples of how to use the code.
An example Jupyter notebook is provided (link).
TODO:
- Refactoring
- Docs
- Make it work with 1-d data (bug)
- Support other covariance types (right now only 'full' is supported, i.e., each component has its own general covariance matrix)