sompy's People

Contributors

anandjeyahar, businessglitch, dfhssilva, ericschles, frsnjung, graeme44, htwangtw, imaculate, ivallesp, jpeer264, jreades, lbugnon, oliviaguest, ricardomourarpm, rsarai, sebastiandev, sevamoo

sompy's Issues

SOMPY missing ipdb

I recently downloaded SOMPY and attempted to run the "California Housing" example code. When I try to import SOMFactory from sompy, I receive this error:

~/anaconda3/lib/python3.6/site-packages/SOMPY-1.0-py3.6.egg/sompy/sompy.py in ()
30
31 #lbugnon
---> 32 import sompy,ipdb
33 #
34

ModuleNotFoundError: No module named 'ipdb'

How do I fix this?
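The traceback points at an unconditional `import sompy,ipdb` on line 32 of sompy.py. ipdb is a debugger, so installing it (`pip install ipdb`) is one workaround, and a guarded import in the library is another. The sketch below shows the guarded-import pattern in general; the `optional_import` helper is a hypothetical name used for illustration and is not part of SOMPY.

```python
import importlib


def optional_import(name):
    """Return the named module, or None when it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return None


# A debugging-only dependency becomes optional instead of fatal.
ipdb = optional_import("ipdb")
if ipdb is None:
    print("ipdb not installed; interactive debugging is disabled")
```

With this pattern, code that wants a breakpoint checks `ipdb is not None` first, and a plain `import sompy` no longer depends on the debugger being present.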

Explanation of update_codebook_voronoi, and why doesn't _distance_matrix change?

Dear all,
Could anybody help me understand the implementation of the update_codebook_voronoi function? I have several questions about it:

        ...
        P = csr_matrix((val, (row, col)), shape=(self.codebook.nnodes,self._dlen))
        S = P.dot(training_data) 

        # neighborhood has nnodes*nnodes and S has nnodes*dim
        # ---> Nominator has nnodes*dim
        nom = neighborhood.T.dot(S)
        nV = P.sum(axis=1).reshape(1, self.codebook.nnodes)
        denom = nV.dot(neighborhood.T).reshape(self.codebook.nnodes, 1)
        new_codebook = np.divide(nom, denom)
  1. So the matrix P shows which data vector belongs to which SOM map node. But I can't understand the matrix S: what is it for? When we compute P x training_data, we sum the components of the vectors that share the same BMU; what is the meaning of that? A description of the nom, nV and denom matrices, and what they are for, would also be helpful.
  2. _distance_matrix is always the same. If I'm not wrong, the SOM's map units should change their distances during the training process, but in the implementation this matrix never changes. Could you explain why?
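One way to read the snippet is as the standard batch SOM update: every node becomes a neighborhood-weighted mean of the data. Under that reading, S holds the per-node sum of the samples in each node's Voronoi set, nV holds the per-node sample count, nom is the neighborhood-weighted sum, and denom is the neighborhood-weighted count. The dense NumPy sketch below mirrors the quoted lines on toy data; `batch_update` and the 1-D chain of nodes are assumptions made for illustration, not SOMPY code.

```python
import numpy as np


def batch_update(training_data, bmu, neighborhood):
    """One batch update step, written densely to mirror the quoted snippet."""
    nnodes = neighborhood.shape[0]
    dlen = training_data.shape[0]
    # P[i, j] = 1 when sample j has node i as its best-matching unit.
    P = np.zeros((nnodes, dlen))
    P[bmu, np.arange(dlen)] = 1.0
    S = P.dot(training_data)                    # per-node SUM of its samples
    nV = P.sum(axis=1).reshape(1, nnodes)       # per-node COUNT of its samples
    nom = neighborhood.T.dot(S)                 # neighborhood-weighted sums
    denom = nV.dot(neighborhood.T).reshape(nnodes, 1)  # weighted counts
    return np.divide(nom, denom)                # weighted mean per node


rng = np.random.default_rng(0)
n_nodes, dlen, dim = 4, 12, 3
data = rng.random((dlen, dim))
bmu = np.arange(dlen) % n_nodes                 # toy BMU assignment

# Gaussian neighborhood over squared GRID distances of a 1-D chain of nodes.
grid = np.arange(n_nodes)
sq_grid_dist = (grid[:, None] - grid[None, :]) ** 2
H = np.exp(-sq_grid_dist / (2.0 * 1.0 ** 2))

new_codebook = batch_update(data, bmu, H)
print(new_codebook.shape)
```

If the neighborhood is replaced by the identity matrix, each node simply becomes the mean of its own samples, which is a useful sanity check. On the second question: in a SOM the neighborhood is normally computed from distances between nodes on the map grid, and the grid itself never moves; only the codebook vectors in input space change. A constant _distance_matrix is consistent with that, assuming it stores grid distances.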

Visualization

When running

som = SOMFactory.build(traindata, mapsize, lattice='rect', normalization=None)
som.train(n_job=1,verbose=0)
from sompy.visualization.mapview import View2D
view2D = View2D(10,10,'title')
view2D.show(som)

I get error:

File "main.py", line 60, in
view2D.show(som)
File "/usr/local/lib/python3.5/dist-packages/SOMPY-1.0-py3.5.egg/sompy/visualization/mapview.py", line 62, in show
File "/usr/local/lib/python3.5/dist-packages/SOMPY-1.0-py3.5.egg/sompy/visualization/mapview.py", line 10, in _calculate_figure_params
AttributeError: 'NoneType' object has no attribute 'denormalize_by'

Any thoughts?

Also, when using normalization='var' instead of None, I had to add a plt.show() inside view.show(); otherwise the figure is not displayed.

Learning rate

Hi,

Could someone please let me know where in the code the learning rate for updating the node's weight vector has been defined?
Any help is highly appreciated.

Thanks,
Parisa

In the SOM train function the verbose mode cannot be set to final

According to the displayed error below, verbose can only be debug, info or None.

:param verbose: verbosity, could be 'debug', 'info' or None
"""
logging.root.setLevel(getattr(logging, verbose.upper()) if verbose else logging.ERROR)

logging.info(" Training...")

AttributeError: 'module' object has no attribute 'FINAL'

I'm guessing I've installed the wrong version, since I've seen 'final' used in one of your provided examples. Link: http://nbviewer.jupyter.org/gist/sevamoo/e93699fdb481de1a932b

I've installed SOMPY using: pip install git+https://github.com/sevamoo/SOMPY.git

Please let me know how I can fix this. Thanks.
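The quoted line looks the verbosity string up as an attribute of the standard logging module, and logging defines DEBUG and INFO but no FINAL, which is where the AttributeError comes from. Passing verbose='info' (or None) avoids it. The sketch below reproduces the lookup with an explicit check; `resolve_level` is a hypothetical helper, not a SOMPY function.

```python
import logging


def resolve_level(verbose):
    """Map a verbosity string to a logging level, mirroring the quoted line."""
    if not verbose:
        return logging.ERROR
    level = getattr(logging, verbose.upper(), None)
    if not isinstance(level, int):
        raise ValueError(
            "verbose must be 'debug', 'info' or None, got %r" % (verbose,)
        )
    return level


print(resolve_level("info") == logging.INFO)
print(resolve_level(None) == logging.ERROR)

try:
    resolve_level("final")
except ValueError as exc:
    print(exc)
```

The explicit ValueError names the accepted values, which is easier to act on than the bare AttributeError in the report.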

Example request on SOM initialization with trained data

It would be nice to have sample code, here or in the examples folder, showing how to initialize a SOM with a previously trained codebook for incremental learning.

Saving and loading a codebook is pretty easy to add here:

# saving (np.save returns None, so there is nothing to assign;
# the_filename should end in .npy, because np.save appends that suffix when it is missing)
np.save(the_filename, SOM.codebook.matrix)
# loading
the_codebook = np.load(the_filename)

But how to reuse it?

How to get cluster labels of each raw data entry?

I tried using sompy to cluster my dataset.

My data size is 2000 and mapsize is [10,10]

When I cluster with som.cluster(n_clusters=4), it returned a set of labels of which the length is 100, and that is the neuron count.

I want to get the cluster labels of my raw data, which means I should get 2000 labels. Could you please show me how to do that?
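A common approach is a two-step lookup: find the best-matching node for each data row, then read off that node's cluster label. The sketch below uses small stand-in arrays. In practice the node labels would be the array returned by som.cluster(...), and the BMU indices could come from som.project_data(data), a call that appears in another report on this page; that both use the same node ordering is an assumption here.

```python
import numpy as np

# Stand-in values: in the question there would be 100 node labels
# and 2000 BMU indices.
node_labels = np.array([0, 0, 1, 1, 2, 3])       # one cluster label per map node
bmu_index = np.array([5, 0, 2, 2, 4, 1, 3, 0])   # best-matching node per data row

# Fancy indexing looks up each row's node label: one label per data row.
row_labels = node_labels[bmu_index]
print(row_labels)        # [3 0 1 1 2 0 1 0]
print(len(row_labels))   # 8, the number of data rows
```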

Bug in training, codebook

I encountered a bug when training the SOM. In some cases (I don't know why yet) the codebook seems to be replicated in one direction; that is, the centroids of every unit in a row have the same value.

This happens in the rough training phase, using the same parameters for different datasets of the same size. I'm trying to figure out why, but in the meantime I attach the code to replicate the bug. Apparently this only happens when using PCA.

bugreplication_minimal.py.txt
traindata1.txt
traindata2.txt

Using too much RAM on large dataset.

I am running document clustering with SOMPY, following the example provided with this project.
I have a list of documents; each element of the list contains the text of the corresponding document.
I followed these steps:

  • Used TF-IDF to vectorize the documents.
  • Got a sparse matrix.
  • Converted the sparse matrix to a dense matrix and then to a square matrix.

When I run the following command
som = sompy.SOMFactory.build(document_list, mapsize, mask=None, mapshape='planar', lattice='rect', normalization='var', initialization='pca', neighborhood='gaussian', training='batch', name='sompy')

mapsize is 20x20.
The size of document_list is 92520x92520.
I read online that people suggest using batch training and reducing the number of features with PCA. I have done that, but RAM usage still reaches 100% (I have 126 GB of RAM and a 12-core processor) and I have to interrupt the program.

Any help at this time will be appreciated.
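A back-of-the-envelope estimate suggests the dense input alone may explain the memory pressure. The sketch below assumes 8-byte float64 values; `dense_gib` is a hypothetical helper, and the exact set of arrays SOMPY allocates internally is not established here.

```python
def dense_gib(rows, cols, bytes_per_value=8):
    """Size in GiB of a dense rows x cols array (float64 by default)."""
    return rows * cols * bytes_per_value / 2**30


n_docs = n_terms = 92520      # shape reported in the question
n_nodes = 20 * 20             # mapsize 20x20

print("one dense copy of the data: %.1f GiB" % dense_gib(n_docs, n_terms))
print("two dense copies:           %.1f GiB" % (2 * dense_gib(n_docs, n_terms)))
print("codebook (nodes x terms):   %.2f GiB" % dense_gib(n_nodes, n_terms))
```

A single dense copy comes to roughly 64 GiB. Tracebacks elsewhere on this page show the constructor storing normalizer.normalize(data) while the visualization code reads som.data_raw, so a raw and a normalized copy may both be held, which would already exceed 126 GB. Keeping the TF-IDF matrix sparse and reducing it to a few hundred columns before densifying is the usual way around this.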

Citation?

How would you like the library to be cited, e.g., in a journal publication?

coordinates for the maps

Dear All,
Does anyone know how to get the coordinates of all data points on the map?
Thank you so much for your help.
Sincerely yours,

ImportError: cannot import name 'SOMFactory'

Hello,

I have installed sompy without any errors (https://github.com/sevamoo/SOMPY).
However, when I try:

import sompy

I receive an error code:

File "C:\Anaconda3\lib\site-packages\sompy-1.0-py3.5.egg\sompy\__init__.py", line 30, in
from sompy import SOMFactory
ImportError: cannot import name 'SOMFactory'

I am using Python 3.5.1 (Anaconda 4.1.1 for Windows)
Any help or suggestions would be greatly appreciated.

Stefan

ipdb dependency problem

I would like to use SOMPY from my Jupyter Notebook Hub.

> print (sys.version)
2.7.12 | packaged by conda-forge | (default, Feb  9 2017, 14:36:30) 
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)]
> !git clone https://github.com/sevamoo/SOMPY.git;
> !cd SOMPY; python setup.py install
running install
running bdist_egg
running egg_info
writing requirements to SOMPY.egg-info/requires.txt
writing SOMPY.egg-info/PKG-INFO
writing top-level names to SOMPY.egg-info/top_level.txt
writing dependency_links to SOMPY.egg-info/dependency_links.txt
reading manifest file 'SOMPY.egg-info/SOURCES.txt'
writing manifest file 'SOMPY.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/sompy
copying build/lib/sompy/codebook.py -> build/bdist.linux-x86_64/egg/sompy
copying build/lib/sompy/__init__.py -> build/bdist.linux-x86_64/egg/sompy
copying build/lib/sompy/normalization.py -> build/bdist.linux-x86_64/egg/sompy
copying build/lib/sompy/neighborhood.py -> build/bdist.linux-x86_64/egg/sompy
creating build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib/sompy/visualization/view.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib/sompy/visualization/hitmap.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib/sompy/visualization/umatrix.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib/sompy/visualization/__init__.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib/sompy/visualization/bmuhits.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib/sompy/visualization/mapview.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib/sompy/visualization/histogram.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib/sompy/visualization/dotmap.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib/sompy/sompy.py -> build/bdist.linux-x86_64/egg/sompy
copying build/lib/sompy/decorators.py -> build/bdist.linux-x86_64/egg/sompy
byte-compiling build/bdist.linux-x86_64/egg/sompy/codebook.py to codebook.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/normalization.py to normalization.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/neighborhood.py to neighborhood.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/view.py to view.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/hitmap.py to hitmap.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/umatrix.py to umatrix.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/bmuhits.py to bmuhits.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/mapview.py to mapview.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/histogram.py to histogram.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/dotmap.py to dotmap.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/sompy.py to sompy.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/decorators.py to decorators.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying SOMPY.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying SOMPY.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying SOMPY.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying SOMPY.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying SOMPY.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating 'dist/SOMPY-1.0-py2.7.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing SOMPY-1.0-py2.7.egg
Removing /opt/conda/envs/python2/lib/python2.7/site-packages/SOMPY-1.0-py2.7.egg
Copying SOMPY-1.0-py2.7.egg to /opt/conda/envs/python2/lib/python2.7/site-packages
SOMPY 1.0 is already the active version in easy-install.pth

Installed /opt/conda/envs/python2/lib/python2.7/site-packages/SOMPY-1.0-py2.7.egg
Processing dependencies for SOMPY==1.0
Searching for numexpr==2.6.2
Best match: numexpr 2.6.2
Adding numexpr 2.6.2 to easy-install.pth file

Using /opt/conda/envs/python2/lib/python2.7/site-packages
Searching for scikit-learn==0.18
Best match: scikit-learn 0.18
Adding scikit-learn 0.18 to easy-install.pth file

Using /opt/conda/envs/python2/lib/python2.7/site-packages
Searching for scipy==0.19.0
Best match: scipy 0.19.0
Adding scipy 0.19.0 to easy-install.pth file

Using /opt/conda/envs/python2/lib/python2.7/site-packages
Searching for numpy==1.12.0
Best match: numpy 1.12.0
Adding numpy 1.12.0 to easy-install.pth file

Using /opt/conda/envs/python2/lib/python2.7/site-packages
Finished processing dependencies for SOMPY==1.0

Because import sompy raises an exception about the missing ipdb module (ipdb is not listed in setup.py), I installed it manually:

>!pip install ipdb
Collecting ipdb
  Downloading ipdb-0.10.3.tar.gz
Requirement already satisfied: setuptools in /opt/conda/envs/python2/lib/python2.7/site-packages (from ipdb)
Requirement already satisfied: ipython<6.0.0,>=0.10.2 in /opt/conda/envs/python2/lib/python2.7/site-packages (from ipdb)
Requirement already satisfied: pickleshare in /opt/conda/envs/python2/lib/python2.7/site-packages (from ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: backports.shutil-get-terminal-size; python_version == "2.7" in /opt/conda/envs/python2/lib/python2.7/site-packages (from ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: decorator in /opt/conda/envs/python2/lib/python2.7/site-packages (from ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: prompt-toolkit<2.0.0,>=1.0.4 in /opt/conda/envs/python2/lib/python2.7/site-packages (from ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: traitlets>=4.2 in /opt/conda/envs/python2/lib/python2.7/site-packages (from ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: pexpect; sys_platform != "win32" in /opt/conda/envs/python2/lib/python2.7/site-packages (from ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: pathlib2; python_version == "2.7" or python_version == "3.3" in /opt/conda/envs/python2/lib/python2.7/site-packages (from ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: simplegeneric>0.8 in /opt/conda/envs/python2/lib/python2.7/site-packages (from ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: pygments in /opt/conda/envs/python2/lib/python2.7/site-packages (from ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: six>=1.9.0 in /opt/conda/envs/python2/lib/python2.7/site-packages (from prompt-toolkit<2.0.0,>=1.0.4->ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: wcwidth in /opt/conda/envs/python2/lib/python2.7/site-packages (from prompt-toolkit<2.0.0,>=1.0.4->ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: ipython-genutils in /opt/conda/envs/python2/lib/python2.7/site-packages (from traitlets>=4.2->ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: enum34; python_version == "2.7" in /opt/conda/envs/python2/lib/python2.7/site-packages (from traitlets>=4.2->ipython<6.0.0,>=0.10.2->ipdb)
Requirement already satisfied: scandir in /opt/conda/envs/python2/lib/python2.7/site-packages (from pathlib2; python_version == "2.7" or python_version == "3.3"->ipython<6.0.0,>=0.10.2->ipdb)
Building wheels for collected packages: ipdb
  Running setup.py bdist_wheel for ipdb ... done
  Stored in directory: /home/jovyan/.cache/pip/wheels/cd/9e/a2/b521d7d90da1032f805e08bf00dce70101ddc39dcb1bb245cb
Successfully built ipdb
Installing collected packages: ipdb
Successfully installed ipdb-0.10.3
> import sys
> sys.path.insert(0, '/SOMPY/sompy/')
> import sompy
MultipleInstanceError                     Traceback (most recent call last)
<ipython-input-7-e59cc99b7bc9> in <module>()
      1 import sys
      2 sys.path.insert(0, '/SOMPY/sompy/')
----> 3 import sompy

build/bdist.linux-x86_64/egg/sompy/__init__.py in <module>()

/opt/conda/envs/python2/lib/python2.7/site-packages/SOMPY-1.0-py2.7.egg/sompy/sompy.pyc in <module>()
     30 
     31 #lbugnon
---> 32 import sompy,ipdb
     33 #
     34 

/opt/conda/envs/python2/lib/python2.7/site-packages/ipdb/__init__.py in <module>()
      5 # https://opensource.org/licenses/BSD-3-Clause
      6 
----> 7 from ipdb.__main__ import set_trace, post_mortem, pm, run             # noqa
      8 from ipdb.__main__ import runcall, runeval, launch_ipdb_on_exception  # noqa
      9 

/opt/conda/envs/python2/lib/python2.7/site-packages/ipdb/__main__.py in <module>()
     60     # the instance method will create a new one without loading the config.
     61     # i.e: if we are in an embed instance we do not want to load the config.
---> 62     ipapp = TerminalIPythonApp.instance()
     63     shell = get_ipython()
     64     def_colors = shell.colors

/opt/conda/envs/python2/lib/python2.7/site-packages/traitlets/config/configurable.pyc in instance(cls, *args, **kwargs)
    421             raise MultipleInstanceError(
    422                 'Multiple incompatible subclass instances of '
--> 423                 '%s are being created.' % cls.__name__
    424             )
    425 

MultipleInstanceError: Multiple incompatible subclass instances of TerminalIPythonApp are being created.

What version of ipdb are you using?

Visualization for bmuhits.py

Greetings,
I was able to run sompy with the example code.
I tried to visualize BmuHitsView, but I encountered an error:
AttributeError: 'module' object has no attribute 'bmuhits'

Do you have any idea/suggestion for this matter?
Thank you in advance.

View2DPacked.show() Exception: Data must be 1-dimensional

Hey, I'm not quite sure why it is throwing this error...
Can anyone help?

The imported file is a 800 by 750 matrix (or something similar)

My code:

    data = pd.read_csv(filename, header=None, index_col=False)
    mapsize = [20,20]
    som = sompy.SOMFactory.build(data, mapsize, mask=None, mapshape='planar', lattice='rect', normalization='var', initialization='pca', neighborhood='gaussian', training='batch', name='sompy')
    som.train(n_job=1, verbose='info')
    v = sompy.mapview.View2DPacked(50, 50, 'test',text_size=8)
    v.show(som, what='codebook', which_dim=[0,1], cmap=None, col_sz=6)
    v.save('2d_packed_test.png')

I get the following:

Traceback (most recent call last):
  File "SOM-solve.py", line 21, in <module>
    v.show(som, what='codebook', which_dim=[0,1], cmap=None, col_sz=6)
  File "build/bdist.linux-x86_64/egg/sompy/visualization/mapview.py", line 120, in show
    axis_num) = self._calculate_figure_params(som, which_dim, col_sz)
  File "build/bdist.linux-x86_64/egg/sompy/visualization/mapview.py", line 14, in _calculate_figure_params
    codebook = som._normalizer.denormalize_by(som.data_raw, som.codebook.matrix)
  File "build/bdist.linux-x86_64/egg/sompy/normalization.py", line 49, in denormalize_by
    return n_vect * st + me
  File "/usr/lib/python2.7/dist-packages/pandas/core/ops.py", line 620, in wrapper
    dtype=dtype)
  File "/usr/lib/python2.7/dist-packages/pandas/core/series.py", line 225, in __init__
    raise_cast_failure=True)
  File "/usr/lib/python2.7/dist-packages/pandas/core/series.py", line 2885, in _sanitize_array
    raise Exception('Data must be 1-dimensional')
Exception: Data must be 1-dimensional

Tuple index out of range help wanted

My goal is to feed in binary vector data (each vector is 16 values long), extract the SOM, label every piece of entered data, and name the clusters.
I entered the data without names because I don't know how the SOM would respond to strings.

import matplotlib.pylab as plt  # The imports
import numpy as np
from time import time
import sompy
import pandas as pd

data = open('verbsnumbers.csv', 'rt', encoding='cp1252')  # The data
data = data.read() 
data = data.replace(';','')  # this deletes all semicolons. Data seems to be read correctly and can be printed as an array of binary vectors
np.asarray(data)  #converts data into a numpy array

mapsize = [10,10]  # The SOM
som = sompy.SOMFactory.build(data, mapsize, mask=None, mapshape='planar', lattice='rect', initialization='pca', training='batch', name='sompy')  # This line causes an error. Should I change initialization or neighborhood methods? To what?

Here is the traceback:
Traceback (most recent call last):
File "D:/Anže/PycharmProjects/Self Organizing Map/SOM.py", line 54, in
som = sompy.SOMFactory.build(data, mapsize, mask=None, mapshape='planar', lattice='rect', initialization='pca', training='batch', name='sompy')
File "D:\Anže\Python_Bratislava\lib\site-packages\sompy\sompy.py", line 96, in build
mapshape, lattice, initialization, training, name, component_names)
File "D:\Anže\Python_Bratislava\lib\site-packages\sompy\sompy.py", line 129, in __init__
self._data = normalizer.normalize(data) if normalizer else data
File "D:\Anže\Python_Bratislava\lib\site-packages\sompy\normalization.py", line 38, in normalize
me, st = self._mean_and_standard_dev(data)
File "D:\Anže\Python_Bratislava\lib\site-packages\sompy\normalization.py", line 35, in _mean_and_standard_dev
return np.mean(data, axis=0), np.std(data, axis=0)
File "D:\Anže\Python_Bratislava\lib\site-packages\numpy\core\fromnumeric.py", line 2909, in mean
out=out, **kwargs)
File "D:\Anže\Python_Bratislava\lib\site-packages\numpy\core\_methods.py", line 57, in _mean
rcount = _count_reduce_items(arr, axis)
File "D:\Anže\Python_Bratislava\lib\site-packages\numpy\core\_methods.py", line 50, in _count_reduce_items
items *= arr.shape[ax]
IndexError: tuple index out of range

Data files:
verbs.xlsx
verbsnumbers.xlsx <= The actual training data, except the data I used is in .csv form.
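The traceback ends in np.mean(data, axis=0), which fails when data has no axis 0. In the posted code, data is still the raw text of the file: the result of np.asarray(data) is discarded, and converting a single string gives a 0-dimensional array anyway. A sketch of parsing the file into a 2-D numeric array follows; the file layout (one semicolon-separated binary vector per line) is an assumption, since the file itself is not shown.

```python
import csv
import io

import numpy as np

# Stand-in for the file contents; replace io.StringIO(text) with an open file.
text = "1;0;1;1\n0;0;1;0\n1;1;0;0\n"

reader = csv.reader(io.StringIO(text), delimiter=";")
rows = [[float(value) for value in row] for row in reader if row]
data = np.array(rows)                  # shape (n_samples, n_features)

print(data.shape)
print(np.mean(data, axis=0))           # works: axis 0 exists

# What the posted code effectively produced: a 0-d array holding one string.
print(np.asarray(text.replace(";", "")).shape)
```

A 2-D float array like `data` is the kind of input the normalizer's np.mean(data, axis=0) call can handle.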

Licensing terms

What are the licensing terms? I'm looking for a SOM library to use in commercial software.

Bug - self._bmu[0] not updated at last training iteration, BmuHitsView visualization incorrect

Hello,

I believe that the BMU stored for the training data in self._bmu[0] is not updated after the last training iteration.
(https://github.com/sevamoo/SOMPY/blob/master/sompy/sompy.py, line 342-343, 360-361)

As BmuHitsView uses self._bmu[0] in visualizing the hit count, the plot will be out of sync with the latest version of the SOM map.
(https://github.com/sevamoo/SOMPY/blob/master/sompy/visualization/bmuhits.py, line 24)

As a side effect, the reported quantization error, based on self._bmu[1], may also be one iteration behind.

In this example, I train a SOM with an arbitrary trainingArray and mapSize, then plot the hit counts.

import sompy
from sompy.visualization.bmuhits import BmuHitsView

somModel=sompy.SOMFactory().build(trainingArray,
                                  mapSize,
                                  normalization = 'var', 
                                  initialization='random', 
                                  component_names=fieldsUsedSOM)

somModel.train(n_job=4, verbose=None, train_rough_len=3, train_finetune_len=6)

vhts  = BmuHitsView(200,200,"Hits Map",text_size=12)
vhts.show(somModel, anotate=True, onlyzeros=False, labelsize=8, cmap="Greys", logaritmic=False)

I then compute the matching BMUs for the same trainingArray, in two ways.

indexes1=somModel.project_data(trainingArray)
indexes2=somModel.find_bmu(somModel._normalizer.normalize(trainingArray), njb=4)

indexesMatch=np.all(indexes1==indexes2[0])

indexesMatch will evaluate to be True, so the two approaches give the same, correct BMU indices. Both find_bmu (line 390) and project_data (line 441) use the updated codebook.matrix (line 343).
(https://github.com/sevamoo/SOMPY/blob/master/sompy/sompy.py)

However, self._bmu[0] will be clearly different from indexes1 and indexes2[0], and this is what I believe to be the key indicator of the bug.

A potential (untested) fix I see would be to pass a flag to the last-executed _batchtrain in finetune_train to run this at line 359:
(https://github.com/sevamoo/SOMPY/blob/master/sompy/sompy.py)

if lastFlag:
    bmu = self.find_bmu(data, njb=njob)

followed by what is already there:

bmu[1] = np.sqrt(bmu[1] + fixed_euclidean_x2)
self._bmu = bmu

Of course, this snippet suggests that the quantization error may also be reported for the previous iteration, as bmu[1] also comes from an earlier find_bmu call. I did not thoroughly go through the logic of this, though.

Thanks for this useful package,
Robert Beck

Clusters in HitMapView

Hi,

I have a question about the HitMapView. I do clustering as follows and the output is perfectly fine:

cl = som.cluster(n_clusters=3)

But the HitMapView always uses the default number of clusters which is 8 and I could not find a way to change it:

h = hitmap.HitMapView(10, 10, 'hitmap', text_size=8, show_text=True)

How can I set the number of clusters in a HitMapView to a different value?

Regards
Amin

hist2d

hist2d function doesn't work properly.

neighborhood.calculate

Hello there,

As far as I can follow the code, the distance_matrix defined in the calculate_map_dist function in sompy.py (line 190) has dimensions nnodes x nnodes. However, in Neighborhood.py at line 27 (np.exp(-1.0*distance_matrix/(2.0*radius**2)).reshape(dim, dim)), it is reshaped to dim x dim. By dim you mean nnodes here, right? I got confused because dim was defined earlier as the number of features.
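For the reshape in that line to be valid, dim * dim has to equal the number of entries in distance_matrix, so dim can only be nnodes there. The sketch below builds a small grid-distance matrix and applies the same Gaussian expression. The grid construction and the use of squared distances are assumptions for illustration, not taken from SOMPY.

```python
import numpy as np

rows, cols = 3, 4
nnodes = rows * cols

# Grid coordinates of every node, then squared grid distances between nodes.
coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
diff = coords[:, None, :] - coords[None, :, :]
distance_matrix = (diff ** 2).sum(axis=2)        # nnodes x nnodes

radius = 1.5
neighborhood = np.exp(-1.0 * distance_matrix / (2.0 * radius ** 2)).reshape(nnodes, nnodes)

print(neighborhood.shape)
```

Each row of the result holds the influence of one node on every other node, which is why both axes have length nnodes and the number of features never enters.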

How to install library?

Sorry I'm very new to Python, I usually use pip or a setup.py to install a library. How do you install this one?

Using SOMPY for text clustering?

Hi, I'm new to SOMs and stumbled across your SOMPY library. Your examples work pretty well so far, but I wonder whether I can also use this library to cluster texts, and what the feature vectors for training should look like to achieve that. Do you have a hint or, by any chance, an example? My goal is to use the data of Grimm's fairytales: I want to cluster the texts by the fairytale they belong to. When I get a new text input, I want a SOM that clusters text in such a way that I can see the relations between the input text and the fairytales. Is this possible?

Greetings

Hexagonal grid

Hi there,

As far as I understand the source code, no hexagonal grid option is implemented yet. Is somebody still working on this? I have read a lot about the advantages of hexagonal grids and wonder why this common option is still missing. I would appreciate any comments!

David

Cannot import SOM

I am using Python 3.5.1 with the PyCharm IDE on Windows, and I am getting the error below.

Can you please help me?

C:\Python35\python.exe C:/Users/azzet/PycharmProjects/PythonSom.py
Traceback (most recent call last):
File "C:/Users/azzet/PycharmProjects/PythonSom.py", line 4, in
import sompy
File "C:\Python35\lib\site-packages\sompy\__init__.py", line 2, in
from sompy import SOM
ImportError: cannot import name 'SOM'

Process finished with exit code 1

Code crashes when mapsize is [200,200]

Hi everyone, I just wanted to check whether this is a known issue.

By default, when you don't specify mapsize, it is set to [20,56]; to increase the number of nodes to be trained, you have to increase mapsize. I did so, and it crashes at [200,200].

Is there any reason why this is happening?

minor typo in /visualization/dotmap.py

Line 66 in /visualization/dotmap.py reads:
plt.subplots_adjust(hspace=.16, swspace=.05)

subplots_adjust() has no 'swspace' parameter but it does have a 'wspace', so the line should presumably read:
plt.subplots_adjust(hspace=.16, wspace=.05)

This library is great sevamoo, thanks

Issue with n_job = -1

Hi! I'm a Computer Engineering student and I'm trying to use SOMPY. I want to speed up the algorithm (reduce the total elapsed time), and I see that the train function has a parameter (n_job) that can help with that. From studying the function, I understand that n_job = -1 should use all the cores of my processor. The problem is that when I call train with n_job = -1, the algorithm freezes and does nothing. The only setting that works well is n_job = 1: with n_job = 2 or n_job = 6 it finishes later than with n_job = 1, which I can't understand (n_job = 6 is supposed to run faster than n_job = 1).

I REALLY NEED a working parallelization of this algorithm so that it runs faster for large maps (approximately 100x100), so if there is a solution to this problem I would be really grateful for your help.

Thank you for your time.

IndexError : tuple out of range when running californiaHousing example

Hi,

The California Housing example throws the following IndexError : tuple out of range:

Traceback (most recent call last):

  File "<ipython-input-12-9bf784a5362f>", line 1, in <module>
    runfile('C:/Github/sompy-scripts/example.py', wdir='C:/Github/sompy-scripts')

  File "c:\python36\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)

  File "c:\python36\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Github/sompy-scripts/example.py", line 21, in <module>
    component_names=names)

  File "c:\python36\lib\site-packages\sompy-1.0-py3.6.egg\sompy\sompy.py", line 93, in build
    mapshape, lattice, initialization, training, name, component_names)

  File "c:\python36\lib\site-packages\sompy-1.0-py3.6.egg\sompy\sompy.py", line 126, in __init__
    self._data = normalizer.normalize(data) if normalizer else data

  File "c:\python36\lib\site-packages\sompy-1.0-py3.6.egg\sompy\normalization.py", line 38, in normalize
    me, st = self._mean_and_standard_dev(data)

  File "c:\python36\lib\site-packages\sompy-1.0-py3.6.egg\sompy\normalization.py", line 35, in _mean_and_standard_dev
    return np.mean(data, axis=0), np.std(data, axis=0)

  File "c:\python36\lib\site-packages\numpy\core\fromnumeric.py", line 2942, in mean
    out=out, **kwargs)

  File "c:\python36\lib\site-packages\numpy\core\_methods.py", line 56, in _mean
    rcount = _count_reduce_items(arr, axis)

  File "c:\python36\lib\site-packages\numpy\core\_methods.py", line 50, in _count_reduce_items
    items *= arr.shape[ax]

IndexError: tuple index out of range

I'm using Python 3.6 obviously. Is that example compatible with this version of Python?

visualisation not available

I installed the sompy package successfully and was able to run the following code (http://nbviewer.jupyter.org/gist/sevamoo/ec0eb28229304f4575085397138ba5b1) up to and including the training step:

import matplotlib.pylab as plt
import pandas as pd
import numpy as np
from time import time
import sys
import sompy

### A toy example: two dimensional data, four clusters

dlen = 200
Data1 = pd.DataFrame(data= 1*np.random.rand(dlen,2))
Data1.values[:,1] = (Data1.values[:,0][:,np.newaxis] + .42*np.random.rand(dlen,1))[:,0]


Data2 = pd.DataFrame(data= 1*np.random.rand(dlen,2)+1)
Data2.values[:,1] = (-1*Data2.values[:,0][:,np.newaxis] + .62*np.random.rand(dlen,1))[:,0]

Data3 = pd.DataFrame(data= 1*np.random.rand(dlen,2)+2)
Data3.values[:,1] = (.5*Data3.values[:,0][:,np.newaxis] + 1*np.random.rand(dlen,1))[:,0]


Data4 = pd.DataFrame(data= 1*np.random.rand(dlen,2)+3.5)
Data4.values[:,1] = (-.1*Data4.values[:,0][:,np.newaxis] + .5*np.random.rand(dlen,1))[:,0]


Data1 = np.concatenate((Data1,Data2,Data3,Data4))

fig = plt.figure()
plt.plot(Data1[:,0],Data1[:,1],'ob',alpha=0.2, markersize=4)
fig.set_size_inches(7,7)

mapsize = [20,20]
som = sompy.SOMFactory.build(Data1, mapsize, mask=None, mapshape='planar', lattice='rect', normalization='var', initialization='pca', neighborhood='gaussian', training='batch', name='sompy')  # this will use the default parameters, but i can change the initialization and neighborhood methods
som.train(n_job=1, verbose='info')

However the next step doesn't work, the visualization tools are not accessible:

v = sompy.mapview.View2DPacked(50, 50, 'test',text_size=8)  
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-33-cf58d4047cc8> in <module>()
----> 1 v = sompy.mapview.View2DPacked(50, 50, 'test', text_size=8)

AttributeError: 'module' object has no attribute 'mapview'

I'm using

$ python --version
Python 3.4.4 :: Anaconda 2.3.0 (x86_64)

SOMPY basic usage

I tried this example: http://bit.ly/1eZvaCM
It does not seem to be up to date. I made some fixes, but I still cannot run it.

I changed:

sm = SOM.SOM('sm', Data, mapsize = [msz0, msz1],norm_method = 'var',initmethod='pca')
sm.train(n_job = 1, shared_memory = 'no',verbose='final')

to:

sm = SOM.SOM(data = Data, neighborhood = neighborhood, mapsize = [msz0, msz1], initialization ='pca') #, norm_method = 'var')
sm.train(n_job = 1, shared_memory = False, verbose='info')

Neighborhood is a required attribute. So, I have to instantiate an object of the type NeighborhoodFactory, right?
So, I created it and used it as a parameter of SOM.SOM method.

neighborhood = NeighborhoodFactory.build('gaussian')

I don't know if this is the right choice. Also, I cannot visualize the resulting data as the example does, because of the changes made to the visualization methods.

Is it possible to assign labels to the data and to check which group each item was assigned to?
I also tried the code below (according to this) and it raises a TypeError.

sm = SOM.SOM(data = Data, neighborhood = neighborhood, mapsize = [msz0, msz1], initialization ='pca') 
labels = ['a','b','c'] 
sm.data_labels(labels)

Is there another basic usage example?

How can I save the time of a function in a variable

Hi everyone!

I know the code uses timing decorators to measure how long a function takes. I need to store the elapsed time of the train function in a variable. My idea is that the train function could return the elapsed time. How can I do that?

Thanks! Sorry for my bad english!
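
One way to get the elapsed time into a variable without changing the library is to time the call from the outside. The sketch below uses only the standard library; `timed_call` is a hypothetical helper, not part of sompy, and `slow_sum` merely stands in for a long-running call such as training.

```python
import time


def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds).

    Hypothetical helper: it is not part of sompy.
    """
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def slow_sum(n):
    """Stand-in for a long-running call such as a training step."""
    return sum(range(n))


result, elapsed = timed_call(slow_sum, 100000)
print("result=%d, elapsed=%.6f s" % (result, elapsed))
```

With a map object the same pattern would presumably read `_, train_seconds = timed_call(som.train, n_job=1, verbose='info')`, assuming `train` accepts those arguments as in the snippets quoted in the other issues.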

setup.py error

When installing with the setup file (python setup.py install), this error log appears:

`running install
running bdist_egg
running egg_info
creating SOMPY.egg-info
writing dependency_links to SOMPY.egg-info/dependency_links.txt
writing requirements to SOMPY.egg-info/requires.txt
writing SOMPY.egg-info/PKG-INFO
writing top-level names to SOMPY.egg-info/top_level.txt
writing manifest file 'SOMPY.egg-info/SOURCES.txt'
reading manifest file 'SOMPY.egg-info/SOURCES.txt'
writing manifest file 'SOMPY.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib.linux-x86_64-2.7
creating build/lib.linux-x86_64-2.7/sompy
copying sompy/decorators.py -> build/lib.linux-x86_64-2.7/sompy
copying sompy/neighborhood.py -> build/lib.linux-x86_64-2.7/sompy
copying sompy/normalization.py -> build/lib.linux-x86_64-2.7/sompy
copying sompy/codebook.py -> build/lib.linux-x86_64-2.7/sompy
copying sompy/__init__.py -> build/lib.linux-x86_64-2.7/sompy
copying sompy/sompy.py -> build/lib.linux-x86_64-2.7/sompy
creating build/lib.linux-x86_64-2.7/sompy/visualization
copying sompy/visualization/view.py -> build/lib.linux-x86_64-2.7/sompy/visualization
copying sompy/visualization/hitmap.py -> build/lib.linux-x86_64-2.7/sompy/visualization
copying sompy/visualization/dotmap.py -> build/lib.linux-x86_64-2.7/sompy/visualization
copying sompy/visualization/umatrix.py -> build/lib.linux-x86_64-2.7/sompy/visualization
copying sompy/visualization/mapview.py -> build/lib.linux-x86_64-2.7/sompy/visualization
copying sompy/visualization/histogram.py -> build/lib.linux-x86_64-2.7/sompy/visualization
copying sompy/visualization/__init__.py -> build/lib.linux-x86_64-2.7/sompy/visualization
copying sompy/visualization/bmuhits.py -> build/lib.linux-x86_64-2.7/sompy/visualization
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/sompy
copying build/lib.linux-x86_64-2.7/sompy/decorators.py -> build/bdist.linux-x86_64/egg/sompy
copying build/lib.linux-x86_64-2.7/sompy/neighborhood.py -> build/bdist.linux-x86_64/egg/sompy
copying build/lib.linux-x86_64-2.7/sompy/normalization.py -> build/bdist.linux-x86_64/egg/sompy
creating build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib.linux-x86_64-2.7/sompy/visualization/view.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib.linux-x86_64-2.7/sompy/visualization/hitmap.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib.linux-x86_64-2.7/sompy/visualization/dotmap.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib.linux-x86_64-2.7/sompy/visualization/umatrix.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib.linux-x86_64-2.7/sompy/visualization/mapview.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib.linux-x86_64-2.7/sompy/visualization/histogram.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib.linux-x86_64-2.7/sompy/visualization/__init__.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib.linux-x86_64-2.7/sompy/visualization/bmuhits.py -> build/bdist.linux-x86_64/egg/sompy/visualization
copying build/lib.linux-x86_64-2.7/sompy/codebook.py -> build/bdist.linux-x86_64/egg/sompy
copying build/lib.linux-x86_64-2.7/sompy/__init__.py -> build/bdist.linux-x86_64/egg/sompy
copying build/lib.linux-x86_64-2.7/sompy/sompy.py -> build/bdist.linux-x86_64/egg/sompy
byte-compiling build/bdist.linux-x86_64/egg/sompy/decorators.py to decorators.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/neighborhood.py to neighborhood.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/normalization.py to normalization.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/view.py to view.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/hitmap.py to hitmap.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/dotmap.py to dotmap.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/umatrix.py to umatrix.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/mapview.py to mapview.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/histogram.py to histogram.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/visualization/bmuhits.py to bmuhits.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/codebook.py to codebook.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/sompy/sompy.py to sompy.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying SOMPY.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying SOMPY.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying SOMPY.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying SOMPY.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying SOMPY.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating dist
creating 'dist/SOMPY-1.0-py2.7.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing SOMPY-1.0-py2.7.egg

Adding SOMPY 1.0 to easy-install.pth file

Processing dependencies for SOMPY==1.0
Searching for numexpr>=2.5
Reading https://pypi.python.org/simple/numexpr/
Best match: numexpr 2.6.2
Downloading https://pypi.python.org/packages/31/43/7777ab9535c416faabd1865453389260fdad0a23a714609c13112885009a/numexpr-2.6.2.tar.gz#md5=943f8e4be7569b1ad01b10cbaa002a5c
Processing numexpr-2.6.2.tar.gz
Writing /tmp/easy_install-uatEIX/numexpr-2.6.2/setup.cfg
Running numexpr-2.6.2/setup.py -q bdist_egg --dist-dir /tmp/easy_install-uatEIX/numexpr-2.6.2/egg-dist-tmp-YN4_tF
Traceback (most recent call last):
File "setup.py", line 14, in
'scikit-learn >= 0.16', 'numexpr >= 2.5']
File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
dist.run_commands()
File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
self.run_command(cmd)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
cmd_obj.run()
File "/venv/local/lib/python2.7/site-packages/setuptools/command/install.py", line 74, in run
self.do_egg_install()
File "/venv/local/lib/python2.7/site-packages/setuptools/command/install.py", line 97, in do_egg_install
cmd.run()
File "/venv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 358, in run
self.easy_install(spec, not self.no_deps)
File "/venv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 574, in easy_install
return self.install_item(None, spec, tmpdir, deps, True)
File "/venv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 625, in install_item
self.process_distribution(spec, dist, deps)
File "/venv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 671, in process_distribution
[requirement], self.local_index, self.easy_install
File "/venv/local/lib/python2.7/site-packages/pkg_resources.py", line 580, in resolve
dist = best[req.key] = env.best_match(req, ws, installer)
File "/venv/local/lib/python2.7/site-packages/pkg_resources.py", line 818, in best_match
return self.obtain(req, installer) # try and download/install
File "/venv/local/lib/python2.7/site-packages/pkg_resources.py", line 830, in obtain
return installer(requirement)
File "/venv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 593, in easy_install
return self.install_item(spec, dist.location, tmpdir, deps)
File "/venv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 623, in install_item
dists = self.install_eggs(spec, download, tmpdir)
File "/venv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 809, in install_eggs
return self.build_and_install(setup_script, setup_base)
File "/venv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 1015, in build_and_install
self.run_setup(setup_script, setup_base, args)
File "/venv/local/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 1000, in run_setup
run_setup(setup_script, args)
File "/venv/local/lib/python2.7/site-packages/setuptools/sandbox.py", line 50, in run_setup
lambda: execfile(
File "/venv/local/lib/python2.7/site-packages/setuptools/sandbox.py", line 100, in run
return func()
File "~/venv/local/lib/python2.7/site-packages/setuptools/sandbox.py", line 52, in
{'__file__':setup_script, '__name__':'__main__'}
File "setup.py", line 227, in

File "setup.py", line 64, in setup_package

ImportError: No module named numpy.distutils.core
`

My solution was to create a requirements.txt file listing the libraries and to install them with pip install -r requirements.txt
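
Since the log ends with a failed import of numpy.distutils.core while numexpr is being built, it can help to check which dependencies are already importable before running the installer. The following is a minimal Python 3 sketch using only the standard library; `missing_modules` is a hypothetical helper, and the module list is an assumption drawn from the names that appear in the log.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found.

    Hypothetical helper, not part of sompy or setuptools.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names assumed from the log: numpy, numexpr, and scikit-learn
# (which is imported under the name "sklearn").
wanted = ["numpy", "numexpr", "sklearn"]
print("missing:", missing_modules(wanted))
```

Anything reported as missing can then be installed with pip before the setup script is run again.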

How can I use sompy cluster for vectors

Hi! I used to cluster data with pymvpa2, which can return a one-dimensional array of labels such as array([0,0,2,0,1,...,0,1,2]), one label per row of my data. My data is an n*m matrix, that is, a set of n vectors with m dimensions each. How can I get the same kind of output from sompy?
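
The kind of output described here, one label per input vector, can be sketched independently of any library: assign each vector to its nearest codebook (prototype) vector, then look up the cluster label of that node. This is a NumPy illustration of the idea, not sompy's API; `label_vectors`, `codebook`, and `node_labels` are hypothetical names chosen for the example.

```python
import numpy as np


def label_vectors(data, codebook, node_labels):
    """Return one cluster label per row of data.

    data:        (n, m) array of n vectors with m dimensions
    codebook:    (k, m) array of prototype vectors
    node_labels: (k,) array giving the cluster label of each prototype
    """
    # Squared Euclidean distance from every vector to every prototype: (n, k)
    diff = data[:, np.newaxis, :] - codebook[np.newaxis, :, :]
    dist2 = np.sum(diff ** 2, axis=2)
    # Index of the nearest prototype for each vector
    nearest = np.argmin(dist2, axis=1)
    return node_labels[nearest]


data = np.array([[0.0, 0.1], [0.9, 1.0], [2.1, 2.0], [0.1, 0.0]])
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
node_labels = np.array([0, 1, 2])
labels = label_vectors(data, codebook, node_labels)
print(labels)
```

In a trained map the codebook would come from the SOM and the node labels from a clustering of the codebook; how sompy exposes those is not shown here.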

show() Exception: Data must be 1-dimensional

When I try to show any result from the SOM, such as the codebook, clusters, or hitmap, the normalisation method estimates the mean and the standard deviation with pandas. However, since the data is not 1-dimensional, an exception is triggered.
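
For comparison, per-column mean and standard deviation can be computed directly on a 2-D array with NumPy, which avoids any 1-dimensional requirement. The sketch below shows variance normalisation in that style; `variance_normalize` is a hypothetical function written for illustration and is not claimed to match sompy's own normaliser.

```python
import numpy as np


def variance_normalize(data):
    """Scale each column to zero mean and unit standard deviation."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    # Avoid division by zero for constant columns
    std[std == 0] = 1.0
    return (data - mean) / std


sample = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
normalized = variance_normalize(sample)
print(normalized.mean(axis=0), normalized.std(axis=0))
```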

_calculate_ms_and_mpd()

I am trying to understand how trainlen and related variables are calculated in _calculate_ms_and_mpd. What do ms and mpd stand for?

Ideally, I would like to set up my own training schedule and to be able to continue training an already trained map. If I am not wrong, neither is possible at the moment, correct?

I will attempt to do this, but there are some parts of the code I don't understand yet.

Basic example not working

I am trying to run (and follow through) the basic example shared by the author. It looks like the structure of the modules has changed and the examples do not work anymore. @oliviaguest @sevamoo could you please help me make it work? For starters, I get the below error on running the 3rd cell from the Basic example:
AttributeError: 'module' object has no attribute 'SOMFactory'

P.S. I installed sompy using pip.

problem with running the examples from notebooks

Hello,
I get this error when I try to run the example from notebook 1:

Traceback (most recent call last):
File "sompy.py", line 7, in
import sompy as SOM
File "/home/kroham/sompy.py", line 27, in
sm = SOM.SOM('sm', Data, mapsize = [msz0, msz1],norm_method = 'var',initmethod='pca')
TypeError: 'module' object is not callable

Are the notebook examples still working?
Thanks in advance.

SOMPY for timeseries prediction

hello,

Would it be possible for you to extend the functionality of SOMPY such that it supports timeseries prediction? As an example model, see VQTAM:

Barreto GA (2007) Time Series Prediction with the Self-Organizing Map: A Review. Perspectives of Neural-Symbolic Integration, Studies in Computational Intelligence Volume 77: 135-158

I think this could be a powerful function mapping technique, especially in the online big-data setting.

many thanks
Gabriel
