hpelm's Introduction

High Performance toolbox for Extreme Learning Machines.

Extreme learning machines (ELMs) are a particular kind of artificial neural network that solves classification and regression problems. Their performance is comparable to that of a classical multilayer perceptron trained with the error back-propagation algorithm, but the training time is up to 6 orders of magnitude smaller (yes, a million times!).

ELMs are suitable for processing huge datasets and dealing with Big Data, and this toolbox is created as their fastest and most scalable implementation.

Documentation is available online; it uses the numpydoc format.

NEW: Parallel HP-ELM tutorial! See the documentation.

  • Efficient matrix math implementation without bottlenecks
  • Efficient data storage (HDF5 file format)
  • Data size not limited by the available memory
  • GPU accelerated computations (if you have one)
  • Regularization and model selection (for in-memory models)

Main classes:
  • hpelm.ELM for in-memory computations (dataset fits into RAM)
  • hpelm.HPELM for out-of-memory computations (dataset on disk in HDF5 format)

Example usage::

    >>> from hpelm import ELM
    >>> elm = ELM(X.shape[1], T.shape[1])
    >>> elm.add_neurons(20, "sigm")
    >>> elm.add_neurons(10, "rbf_l2")
    >>> elm.train(X, T, "LOO")
    >>> Y = elm.predict(X)
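
For intuition about why ELM training is fast, here is a minimal NumPy sketch of a single-hidden-layer ELM: the hidden weights are drawn at random and kept fixed, and only the output weights are computed, with one least-squares solve. It is an illustration, not hpelm's implementation; the names `train_elm` and `predict_elm`, the `tanh` activation and the toy sine data are assumptions made for the example.

```python
import numpy as np

def train_elm(X, T, n_hidden, rng):
    """Fit a toy ELM: random fixed hidden layer, least-squares output layer."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.standard_normal(n_hidden)                # random biases, never trained
    H = np.tanh(X @ W + b)                           # hidden-layer outputs
    beta = np.linalg.lstsq(H, T, rcond=None)[0]      # output weights from one linear solve
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Project the inputs through the fixed hidden layer, then apply beta."""
    return np.tanh(X @ W + b) @ beta

# Toy regression problem (made up for the example): fit sin(x) on [-3, 3].
rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 200).reshape(-1, 1)
T = np.sin(X)
W, b, beta = train_elm(X, T, n_hidden=30, rng=rng)
mse = float(np.mean((predict_elm(X, W, b, beta) - T) ** 2))
print("training MSE:", mse)
```

No iterative optimisation is involved, which is where the speed advantage over back-propagation comes from.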

If you use the toolbox, cite our open access paper "High Performance Extreme Learning Machines: A Complete Toolbox for Big Data Applications" in IEEE Access.

@ARTICLE{7140733,
  author={Akusok, A. and Bj\"{o}rk, K.-M. and Miche, Y. and Lendasse, A.},
  journal={Access, IEEE},
  title={High-Performance Extreme Learning Machines: A Complete Toolbox for Big Data Applications},
  year={2015},
  volume={3},
  pages={1011-1025},
  doi={10.1109/ACCESS.2015.2450498},
  ISSN={2169-3536},
  month={},
}

hpelm's Issues

Confusion matrix and multiclass model


  • I get this error message when using the confusion() function: "Confusion matrix works only for regression"
    How can I compute a confusion matrix for multiclass classification?
    When I define the hpelm object like this:
    model = hpelm.ELM(inputs=50, outputs=6, classification='mc')
    it gives me an error message saying:
    "T has wrong dimensionality: expected classes 6, found 1"
    I have 6 classes, yet when I change outputs to 1 it works fine and computes the model error.
    Any explanation would be appreciated!
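
The second message suggests that T was passed as a single column of class labels while the model was created with outputs=6. A common convention for classification targets, assumed here, is one row per sample and one column per class (one-hot encoding). Below is a minimal NumPy sketch of that encoding, together with a confusion matrix computed by hand from predicted scores. `one_hot` and `confusion_matrix` are hypothetical helpers written for this example, not hpelm functions, and the labels are made up.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Turn integer labels 0..n_classes-1 into an (N, n_classes) 0/1 matrix."""
    labels = np.asarray(labels, dtype=int).ravel()
    T = np.zeros((labels.size, n_classes))
    T[np.arange(labels.size), labels] = 1.0
    return T

def confusion_matrix(T_true, Y_pred):
    """Rows are true classes, columns are predicted classes (argmax of each row)."""
    n_classes = T_true.shape[1]
    true_idx = T_true.argmax(axis=1)
    pred_idx = Y_pred.argmax(axis=1)
    conf = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(conf, (true_idx, pred_idx), 1)  # count each (true, predicted) pair
    return conf

labels = [0, 2, 5, 2, 1]
T = one_hot(labels, 6)                       # shape (5, 6), matching outputs=6
Y = T.copy()                                 # stand-in for model predictions
Y[1] = [0.1, 0.7, 0.2, 0.0, 0.0, 0.0]        # pretend sample 1 (true class 2) was predicted as class 1
conf = confusion_matrix(T, Y)
print(conf)
```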


Printing the network structure after training

For example, if I start with 50 linear nodes and 100 sigmoid nodes and train with 'LOO' and 'OP', unnecessary nodes are dropped during training.

So is it possible to know how many nodes are left, or to log the nodes dropped at each step?

Also, is it possible to train with a fixed structure (given that the best number of nodes has already been found in other tests)?

Thank you

Not compatible with Python 3.x

Unfortunately, this implementation is not compatible with Python 3.x. I tested it under version 3.5 and there are many problems that need to be fixed, such as:

  • Strings are now of type str rather than basestring
  • xrange has become range
  • There is no longer a range that returns a list

And some more. I hope this will be fixed sometime.
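
A minimal sketch of compatibility shims for the first two points, written so that the same code runs on Python 2 and Python 3. The names `string_types`, `lazy_range` and `is_string` are hypothetical and not part of hpelm.

```python
# Compatibility shims that run on both Python 2 and Python 3.
try:
    string_types = basestring  # Python 2: common base class of str and unicode
except NameError:
    string_types = str         # Python 3: basestring no longer exists

try:
    lazy_range = xrange        # Python 2: lazy integer sequence
except NameError:
    lazy_range = range         # Python 3: range is already lazy

def is_string(value):
    """True for text values on either Python version."""
    return isinstance(value, string_types)

indices = list(lazy_range(5))  # an explicit list behaves the same on both versions
print(is_string("sigm"), indices)
```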

multilayer elm

Hi, this may not be the best place to ask, but I am trying to train a multilayer ELM. Unfortunately, it produces results very similar to a single-layer classification ELM, which are poorer than those of a simple random forest classifier. My configuration is copied below. Am I missing anything? Thanks.
input shape is
(18356L, 7500L) max: 1.0 min: -1.0
ninputsig = 40
ninputlin = 3

elmInput = ELM(input_shape, input_shape)
elmInput.add_neurons(ninputsig, "sigm")
elmInput.add_neurons(ninputlin, "lin")
elmInput.train(XXtrainIn, XXtrainOut,"r") 
print("\n Trained input elm {}".format(elmInput))
elmInputProjection= elmInput.project(XXtrainIn)    

elmHidden1 = ELM(elmInputProjection.shape[1], elmInputProjection.shape[1])
elmHidden1.add_neurons(ninputsig, "sigm")
elmHidden1.add_neurons(ninputlin, "lin")
elmHidden1.train(elmInputProjection, XXtrainOut,"r")
elmHiddenProjection= elmHidden1.project(elmInputProjection)

elmOutput = ELM(elmHiddenProjection.shape[1], YY.shape[1])
elmOutput.add_neurons(ninputsig, "sigm")
elmOutput.add_neurons(ninputlin, "lin")

# now I should bring all samples here. 
inputproj = elmInput.project(XXtrainIn)
hiddenproj = elmHidden1.project(inputproj)
elmOutput.train(hiddenproj, YYtrain,"wc", w=[0.5, 0.5])

# now prediction for training
prjinput = elmInput.project(XXtrainIn)
prjhidden = elmHidden1.project(prjinput)
youtput = elmOutput.predict(prjhidden)

# training results
mse_error = elmOutput.error(youtput, YYtrain)
print("Training mse: {}".format(mse_error))
p = ytrain.squeeze()
yout = np.argmax(youtput, axis=1)
nhit = np.sum(yout == p)                  # correctly classified samples
ntpos = np.sum((yout == 1) & (p == 1))    # true positives
npos = np.sum(p == 1)                     # all positives
print("\n Training results")
print("Tpos: {} / {} TD: {}".format(ntpos, npos, ntpos / float(npos)))
print("Acc: {} total {}".format(nhit / float(len(p)), len(p)))

Find the number of neurons after pruning


Is there any way to find the total number of neurons that are kept after pruning with OP-ELM or P-ELM? Which function returns the number of neurons left after pruning? I would appreciate any help with this. Thank you.
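
As a conceptual illustration only (not hpelm's code), pruning of this kind can be viewed as ranking the neurons and then keeping the first k that minimise a leave-one-out (LOO) error. The NumPy sketch below assumes the hidden-layer outputs are already ranked by column and uses the PRESS formula to get the LOO error of a least-squares fit without refitting. The helper names and the random data are made up for the example.

```python
import numpy as np

def press_loo_mse(H, T):
    """Leave-one-out MSE of a least-squares fit, via the PRESS formula."""
    P = np.linalg.pinv(H)                    # (k, N) pseudo-inverse of H
    residual = T - H @ (P @ T)               # ordinary training residuals
    leverage = np.einsum("ij,ji->i", H, P)   # diagonal of the hat matrix H @ pinv(H)
    loo_residual = residual / (1.0 - leverage)[:, None]
    return float(np.mean(loo_residual ** 2))

def best_neuron_count(H_ranked, T):
    """Try the first k ranked neurons for every k and keep the lowest LOO error."""
    errors = [press_loo_mse(H_ranked[:, :k], T)
              for k in range(1, H_ranked.shape[1] + 1)]
    return int(np.argmin(errors)) + 1, errors

rng = np.random.default_rng(0)
H = rng.standard_normal((60, 8))             # stand-in for ranked hidden-layer outputs
# Targets depend only on the first three "neurons", plus a little noise.
T = H[:, :3] @ np.array([[1.0], [-2.0], [0.5]]) + 0.01 * rng.standard_normal((60, 1))
k_best, errors = best_neuron_count(H, T)
print("neurons kept:", k_best)
```

In this picture, the number of neurons left after pruning is simply the selected k.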


I cannot import hpelm anymore

I cannot import hpelm. Anytime I try, the following error appears:

Traceback (most recent call last):
File "c:/Users/jabel/OneDrive - Universidad de Córdoba/Scripts_python/et0-reg-private/", line 13, in
import lib.ml_models.elm_regression as ELM
File "c:\Users\jabel\OneDrive - Universidad de Córdoba\Scripts_python\et0-reg-private\lib\ml_models\", line 12, in
import hpelm
File "C:\Users\jabel.conda\envs\agroml\lib\site-packages\", line 7, in
from .elm import ELM
File "C:\Users\jabel.conda\envs\agroml\lib\site-packages\hpelm\", line 11, in
from tables import open_file
File "C:\Users\jabel.conda\envs\agroml\lib\site-packages\", line 45, in
from .utilsextension import get_hdf5_version as _get_hdf5_version
ImportError: DLL load failed while importing utilsextension: The specified module could not be found.


My conda environment is the following:

Name Version Build Channel

absl-py 0.15.0 pypi_0 pypi
asttokens 2.0.5 pypi_0 pypi
astunparse 1.6.3 pypi_0 pypi
atomicwrites 1.4.0 pypi_0 pypi
attrs 21.4.0 pypi_0 pypi
backcall 0.2.0 pypi_0 pypi
ca-certificates 2021.10.26 haa95532_4
cachetools 5.0.0 pypi_0 pypi
certifi 2021.10.8 py38haa95532_2
charset-normalizer 2.0.12 pypi_0 pypi
colorama 0.4.4 pypi_0 pypi
cycler 0.11.0 pypi_0 pypi
debugpy 1.5.1 pypi_0 pypi
decorator 5.1.1 pypi_0 pypi
entrypoints 0.4 pypi_0 pypi
ephem 4.1.3 pypi_0 pypi
et-xmlfile 1.1.0 pypi_0 pypi
executing 0.8.2 pypi_0 pypi
fasteners 0.17.3 pypi_0 pypi
flatbuffers 1.12 pypi_0 pypi
fonttools 4.29.1 pypi_0 pypi
gast 0.4.0 pypi_0 pypi
google-auth 2.6.0 pypi_0 pypi
google-auth-oauthlib 0.4.6 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
grpcio 1.34.1 pypi_0 pypi
h5py 3.1.0 pypi_0 pypi
hpelm 1.0.10 pypi_0 pypi
icecream 2.1.2 pypi_0 pypi
idna 3.3 pypi_0 pypi
importlib-metadata 4.11.1 pypi_0 pypi
iniconfig 1.1.1 pypi_0 pypi
ipykernel 6.9.1 pypi_0 pypi
ipython 8.1.1 pypi_0 pypi
jedi 0.18.1 pypi_0 pypi
joblib 1.1.0 pypi_0 pypi
jupyter-client 7.1.2 pypi_0 pypi
jupyter-core 4.9.2 pypi_0 pypi
keras-nightly 2.5.0.dev2021032900 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
kiwisolver 1.3.2 pypi_0 pypi
markdown 3.3.6 pypi_0 pypi
matplotlib 3.5.1 pypi_0 pypi
matplotlib-inline 0.1.3 pypi_0 pypi
nest-asyncio 1.5.4 pypi_0 pypi
nose 1.3.7 pypi_0 pypi
numexpr 2.8.1 pypi_0 pypi
numpy 1.19.5 pypi_0 pypi
oauthlib 3.2.0 pypi_0 pypi
openpyxl 3.0.9 pypi_0 pypi
openssl 1.1.1m h2bbff1b_0
opt-einsum 3.3.0 pypi_0 pypi
packaging 21.3 pypi_0 pypi
pandas 1.4.1 pypi_0 pypi
parso 0.8.3 pypi_0 pypi
pickleshare 0.7.5 pypi_0 pypi
pillow 9.0.1 pypi_0 pypi
pip 21.2.2 py38haa95532_0
pluggy 1.0.0 pypi_0 pypi
prompt-toolkit 3.0.28 pypi_0 pypi
protobuf 3.19.4 pypi_0 pypi
pure-eval 0.2.2 pypi_0 pypi
py 1.11.0 pypi_0 pypi
pyaml 21.10.1 pypi_0 pypi
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pygments 2.11.2 pypi_0 pypi
pyparsing 3.0.7 pypi_0 pypi
pytest 7.0.1 pypi_0 pypi
python 3.8.12 h6244533_0
python-dateutil 2.8.2 pypi_0 pypi
pytz 2021.3 pypi_0 pypi
pywin32 303 pypi_0 pypi
pyyaml 6.0 pypi_0 pypi
pyzmq 22.3.0 pypi_0 pypi
requests 2.27.1 pypi_0 pypi
requests-oauthlib 1.3.1 pypi_0 pypi
rsa 4.8 pypi_0 pypi
scikit-learn 1.0.2 pypi_0 pypi
scikit-optimize 0.9.0 pypi_0 pypi
scipy 1.8.0 pypi_0 pypi
setuptools 58.0.4 py38haa95532_0
six 1.15.0 pypi_0 pypi
sqlite 3.37.2 h2bbff1b_0
stack-data 0.2.0 pypi_0 pypi
tables 3.7.0 pypi_0 pypi
tensorboard 2.8.0 pypi_0 pypi
tensorboard-data-server 0.6.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
tensorflow 2.5.0 pypi_0 pypi
tensorflow-estimator 2.5.0 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
threadpoolctl 3.1.0 pypi_0 pypi
tomli 2.0.1 pypi_0 pypi
tornado 6.1 pypi_0 pypi
tqdm 4.62.3 pypi_0 pypi
traitlets 5.1.1 pypi_0 pypi
typing-extensions pypi_0 pypi
urllib3 1.26.8 pypi_0 pypi
vc 14.2 h21ff451_1
vs2015_runtime 14.27.29016 h5e58377_2
wcwidth 0.2.5 pypi_0 pypi
werkzeug 2.0.3 pypi_0 pypi
wheel 0.37.1 pyhd3eb1b0_0
wincertstore 0.2 py38haa95532_2
wrapt 1.12.1 pypi_0 pypi
xlwt 1.3.0 pypi_0 pypi
zipp 3.7.0 pypi_0 pypi


I have tried running it in VS Code and PyCharm. I also uninstalled and reinstalled it (even a different version).

Any clue about what is happening and how to resolve it? On other computers it works.
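
The traceback ends inside `tables`, which hpelm imports, so one way to narrow this down is to import each package on its own and see which one fails first. The sketch below uses only the standard library. `try_import` is a hypothetical helper, and the list of packages is taken from the traceback and the environment listing above. It only localises the failure; it does not fix it.

```python
import importlib

def try_import(name):
    """Return (True, "") if the module imports, else (False, error text)."""
    try:
        importlib.import_module(name)
        return True, ""
    except Exception as exc:  # ImportError, or an OSError raised while loading a DLL
        return False, "{}: {}".format(type(exc).__name__, exc)

# Order follows the traceback: hpelm imports tables, which loads its HDF5 extension.
for name in ("tables", "h5py", "hpelm"):
    ok, error = try_import(name)
    print(name, "OK" if ok else "FAILED -> " + error)
```

If `tables` already fails by itself, the problem lies in that installation rather than in hpelm.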

Dimensionality of input/output data parameters annoying

Could we remove the dimensionality constraints on the input/output data? They really get in the way when I try to train the same network on data of arbitrary length.

I'm talking about the simple functionality like in this project:

I'm running into this problem here:

I don't want to limit the length of input. In ELM.jl project output is also arbitrary and depends on the length of the input. I can crop the output or fill with zeros but I don't want to limit the input length.

Do you have any idea? Thanks a lot for this great library.

Add more submodules

These submodules could be used in other applications such as inversion, clustering, etc.

calculate cross validation accuracy

I am trying to do 10-fold cross-validation. How do I calculate the accuracy of each fold and the average accuracy?

The code is given below:

import numpy as np
import os
from hpelm import ELM

curdir = os.path.dirname(__file__)
pX = os.path.join(curdir, "data/data.txt")
pY = os.path.join(curdir, "data/class-labelonehot.txt")

X = np.loadtxt(pX)
Y = np.loadtxt(pY)

elm = ELM(21,3)
elm.add_neurons(100, "sigm")
elm.train(X, Y, "CV", k=10)
yytest = elm.predict(X)
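
Predicting on the full training set X, as in the last line, does not give per-fold accuracies. One way to get them is to split the data by hand, train on nine folds and score the held-out fold. The NumPy sketch below does that with a plain least-squares model as a stand-in so that it runs without hpelm. `kfold_accuracy`, `linear_fit_predict` and the toy data are assumptions made for the example; the commented-out function shows where an ELM, built with the same calls as in the post, could be plugged in.

```python
import numpy as np

def kfold_accuracy(X, Y, k, fit_predict, seed=0):
    """Return the accuracy of each fold; Y is one-hot, one column per class."""
    order = np.random.default_rng(seed).permutation(X.shape[0])
    accuracies = []
    for test_idx in np.array_split(order, k):
        train_idx = np.setdiff1d(order, test_idx)
        Y_hat = fit_predict(X[train_idx], Y[train_idx], X[test_idx])
        hits = Y_hat.argmax(axis=1) == Y[test_idx].argmax(axis=1)
        accuracies.append(float(hits.mean()))
    return accuracies

def linear_fit_predict(X_train, Y_train, X_test):
    """Stand-in model (plain least squares) so the sketch runs without hpelm."""
    B = np.linalg.lstsq(X_train, Y_train, rcond=None)[0]
    return X_test @ B

# Hypothetical replacement using the calls from the post:
# def elm_fit_predict(X_train, Y_train, X_test):
#     elm = ELM(X_train.shape[1], Y_train.shape[1])
#     elm.add_neurons(100, "sigm")
#     elm.train(X_train, Y_train, "c")
#     return elm.predict(X_test)

# Toy data: 3 well-separated classes, 21 features, one-hot labels.
rng = np.random.default_rng(1)
labels = np.repeat(np.arange(3), 40)
X = rng.standard_normal((120, 21)) * 0.1
X[np.arange(120), labels] += 3.0             # feature c is large for class c
Y = np.eye(3)[labels]

fold_acc = kfold_accuracy(X, Y, k=10, fit_predict=linear_fit_predict)
print("per-fold accuracy:", fold_acc)
print("average accuracy:", sum(fold_acc) / len(fold_acc))
```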

Classification mode 'c' is overwritten by 'None'

When I initialize a new ELM object, I set its 'classification' attribute to 'c', so I expect to obtain a regular classification error when calling ELM.error(). However, I noticed it is overwritten by 'None' in _train_parse_args, thus returning regression error.

A workaround is to set 'classification' to 'c' again when calling ELM.train(), but I think the desired behavior would be to keep the 'classification' attribute chosen when initializing. If ELM.train() sets a new 'classification' value, then ok, it should be overwritten, but it should not be overwritten to 'None' by default.
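
The behaviour described as desirable here can be sketched with a sentinel default, so that "argument not passed" is distinguishable from any real value. `ToyModel` is a hypothetical stand-in and not hpelm's ELM class; the sketch only shows the pattern.

```python
_UNSET = object()  # sentinel meaning "the caller did not pass this argument"

class ToyModel(object):
    """Hypothetical stand-in for a model with a `classification` setting."""

    def __init__(self, classification=""):
        self.classification = classification

    def train(self, classification=_UNSET):
        # Overwrite the stored mode only when the caller explicitly asks for it.
        if classification is not _UNSET:
            self.classification = classification
        return self.classification

model = ToyModel(classification="c")
mode_after_plain_train = model.train()                   # keeps the value from __init__
mode_after_override = model.train(classification="wc")   # an explicit value wins
print(mode_after_plain_train, mode_after_override)
```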

Weird error when 'OP' argument is set

When calling ELM.train(data, labels, "CV", "OP", 'c', k=5) on the data and labels attached, I get the following error:

Traceback (most recent call last):
File "", line 81, in
elm.train(data, labels, "CV", 'OP', 'c', k=5)
File "/usr/local/lib/python3.5/dist-packages/hpelm/", line 198, in train
e = train_cv(self, X, T, k)
File "/usr/local/lib/python3.5/dist-packages/hpelm/", line 53, in train_cv
rank, L = self._ranking(Hvl.shape[1], Hvl, Tvl)
File "/usr/local/lib/python3.5/dist-packages/hpelm/", line 503, in _ranking
rank = mrsr2(H, T, L)
File "/usr/local/lib/python3.5/dist-packages/hpelm/modules/", line 66, in mrsr2
AttributeError: 'range' object has no attribute 'remove'

I couldn't figure out how to solve it other than by removing the "OP" flag. The flag works for another data file, much smaller than this one.
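
The last frame of the traceback fits one of the Python 3 changes: `range` no longer returns a list, so the object has no `remove` method. Below is a minimal reproduction and the usual fix (wrapping the range in `list`), independent of hpelm. That `mrsr2` builds its index list this way is an inference from the error message, not something checked against the source.

```python
candidates = range(5)
try:
    candidates.remove(2)       # works on a Python 2 list, fails on a Python 3 range
    failed = False
except AttributeError:
    failed = True

candidates = list(range(5))    # explicit list: mutable on both Python versions
candidates.remove(2)
print(failed, candidates)
```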

Warning when training with too many neurons

Whenever I am training the HPELM with large numbers of neurons (when SVD is used), I get the following warning:
/Applications/anaconda2/envs/py3/lib/python3.6/site-packages/hpelm/nnets/ FutureWarning: rcond parameter will change to the default of machine precision times max(M, N) where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass rcond=None, to keep using the old, explicitly pass rcond=-1.
B = np.linalg.lstsq(HH, HT)[0]

This is not a problem for me; I just wanted to draw your attention to it, as it may cause problems in the future.
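
As the warning text itself says, passing `rcond` explicitly silences it. Here is a small NumPy sketch of the same call with `rcond=None`. The variable names mirror the line quoted in the warning; the data and the way `HH` and `HT` are built are made up for the example.

```python
import warnings
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((50, 5))       # stand-in hidden-layer outputs
B_true = rng.standard_normal((5, 2))
HH = H.T @ H                           # normal-equation matrices, as the names suggest
HT = H.T @ (H @ B_true)

with warnings.catch_warnings():
    warnings.simplefilter("error")     # any warning would now raise an exception
    B = np.linalg.lstsq(HH, HT, rcond=None)[0]  # explicit rcond: no FutureWarning

print(np.allclose(B, B_true))
```

Passing `rcond=-1` instead would keep the old behaviour, as the warning notes.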

