ayrna / orca

Ordinal Regression and Classification Algorithms

Home Page: http://www.uco.es/grupos/ayrna/orreview

License: GNU General Public License v3.0

MATLAB 98.73% Makefile 0.38% Shell 0.89%

machine-learning ordinal-classification ordinal-regression support-vector-machine matlab octave

orca's Issues

Homogenize API regarding model saving

Threshold shape
Projection shape
Other parameters

Parameter processing can confuse parameters with similar names

Parameter processing can be unstable depending on the parameter names the user chooses. Currently, if we have two different parameters, algorithm and algorithmDefault, either of them may end up assigned to obj.method. The issue comes from Experiment.m:

elseif strncmpi('algorithm',nueva_linea, 3),

This compares only the first three characters. A quick fix is to use length(), so that:

elseif strncmpi('algorithm',nueva_linea, length('algorithm')),
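
Note that a prefix comparison of any length still matches longer keys that start with the same text. The sketch below illustrates this with a hypothetical configuration line. The variable names and the exact-key alternative are assumptions for illustration, not the code in Experiment.m.

```matlab
% Hypothetical configuration line, as it might be read from an .ini file
new_line = 'algorithmDefault = SVORIM';

% Comparing only the first 3 characters matches any key starting with 'alg'
match_3 = strncmpi('algorithm', new_line, 3);

% Comparing length('algorithm') characters is stricter, but a key that
% merely starts with 'algorithm' still matches
match_len = strncmpi('algorithm', new_line, length('algorithm'));

% Assumed alternative: extract the key before '=' and compare it exactly
parts = strsplit(new_line, '=');
key = strtrim(parts{1});
match_exact = strcmpi(key, 'algorithm');

fprintf('3 chars: %d, full prefix: %d, exact key: %d\n', ...
        match_3, match_len, match_exact);
```

With an exact comparison of the parsed key, algorithmDefault would no longer be mistaken for algorithm.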

Issues testing from Octave

Hi,

I'm running ORCA from Octave installed on Ubuntu 18.10.

Installation goes fine, but when I try to run the tests (from the Octave shell) I get the following errors:

warning: struct: converting a classdef object into a struct overrides the access restrictions defined for properties. All properties are returned, including private and protected ones.
warning: called from
    fieldnames at line 47 column 11
    parseArgs at line 114 column 29
    POM at line 56 column 13
    pomTest at line 7 column 14
    runtestssingle at line 35 column 5
panic: Segmentation fault -- stopping myself...
.........................
Performing test for POM
Accuracy Train 0.408889, Accuracy Test 0.333333
Test accuracy matchs reference accuracy
Processing redsvmTest.m...
attempting to save variables to 'octave-workspace'...
error: octave_base_value::save_binary(): wrong type argument 'object'

Any clue as to what is causing the error?

Avoid hardcoded base model in OPBE


Windows port

Topics related to the Windows port. ORCA is being ported to Windows. The methods that do not depend on external C/C++ code should work out of the box. For the rest, some work has to be done using GCC and Make on Windows.

List of methods that work on Windows (R2017b):

Unified experiments ini

Hi! I have been trying to run a unified experiment file (ini format) with a mixture of classifiers (POM, KDLOR) and I always get the same error:

Setting up experiments...
Running experiment exp-kdlor-amae-tormentas-1.ini
Running experiment exp-pom-tormentas-1.ini
error: Error 'C' is not a recognized class parameter name
error: called from
    paramopt>checkParameters at line 135 column 9
    paramopt at line 28 column 1
    crossValideParams at line 180 column 22
    run at line 68 column 26
    launch at line 55 column 13
    runExperiments at line 90 column 25

This is the configuration file:

[pom]
{general-conf}
seed = 1
basedir = ../exampledata/1-holdout/
datasets = tormentas
standarize = true

{algorithm-parameters}
algorithm = POM

[kdlor-amae]
{general-conf}
seed = 1
basedir = ../exampledata/1-holdout/
datasets = tormentas
standarize = true
num_folds = 5
cvmetric = amae

{algorithm-parameters}
algorithm = KDLOR
kernelType = rbf

{algorithm-hyper-parameters-to-cv}
C = 10.^(-3:1:3)
k = 10.^(-3:1:3)
u = 0.01,0.001,0.0001,0.00001,0.000001

It seems that POM is trying to use the hyper-parameters for KDLOR.

Thanks a lot!

Update or suppress bash Makefiles

We need to update the bash Makefiles to match the rules in the Matlab/Octave set of make.m. However, I'd suggest suppressing the bash Makefiles, since they can be confusing: the user needs to properly set up the environment variables to point to the Matlab/Octave install dir.

Refactor predict()

An algorithm's classification method is obj.predict(patterns, model). Following OOP conventions, the model should be stored in obj.model, thereby allowing obj.predict(patterns). However, this would affect ensemble models and binary decomposition methods, since they usually hold several models and reuse predict several times to perform the prediction (see OPBE).

The changes can be done, but are not straightforward.

Upload solutions for tutorials

The following solutions should be uploaded:

Tutorial 1
Tutorial 2
Tutorial 3

ORCA not listed on the octave package index

Dear maintainers,

I noticed your package does not appear on the octave package index. Please consider adding it! You can find the octave package index and instructions how to add your package here: https://gnu-octave.github.io/packages/

Ideally you should create a package in the Octave package format that can be installed via Octave's pkg command; however, this is no longer a strict requirement, and packages with custom installation instructions are also accepted on the index, as long as they provide clear installation instructions. :)

Bug in parallel processing

After doing a parallel run of tests:

Parallel pool using the 'local' profile is shutting down.
Calculating results...
Undefined function or variable 'myExperiment'.

Error in Utilities.runExperiments (line 100)
            Utilities.results([logsDir '/' 'Results'],'report_sum', myExperiment.report_sum, 'train', true);

Error in runtestscv (line 38)
    exp_dir = Utilities.runExperiments([tests_dir '/' files(i).name], 'parallel', true);

Remove non-used optimizers for KDLOR

KDLOR effectively uses only MATLAB's quadprog.

Remove the remaining optimizers
Look for a free software QP optimizer

Hello, I have a question

Hello, I am a student. I ran the code, but for some datasets it shows "Matrix dimensions don't agree" in the results method of the Utilities class:
cm = confusionmat(act{h},pred{h});
cm_sum = cm_sum + cm;
I found that the problem is the confusionmat function: the dimensions of the cm it returns do not agree with those of cm_sum.
Can you give me some advice?
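
One plausible cause (an assumption, not confirmed in this thread) is that a fold does not contain every class, so confusionmat returns a smaller matrix for that fold. The sketch below builds a confusion matrix of fixed size with accumarray instead; the labels, num_classes and the assumption that classes are the integers 1..num_classes are hypothetical.

```matlab
% Assumed: class labels are integers 1..num_classes
num_classes = 3;

% Hypothetical actual/predicted labels for two folds; fold 2 lacks class 3
act  = {[1; 2; 3; 2], [1; 2; 2; 1]};
pred = {[1; 2; 2; 2], [1; 1; 2; 1]};

cm_sum = zeros(num_classes);
for h = 1:numel(act)
    % Count (actual, predicted) pairs into a num_classes x num_classes matrix,
    % so every fold yields the same size even if a class is missing
    cm = accumarray([act{h}, pred{h}], 1, [num_classes num_classes]);
    cm_sum = cm_sum + cm;
end

disp(cm_sum);
```

Rows correspond to actual classes and columns to predicted classes, as in confusionmat.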

Test methods before and after the update

Correct problems with orboostall, svorex, svorim and svorimLin
Test the rest of methods

Installation under Octave

Hi,

I have Octave installed but when compiling the sources (I'm following the install guide), the Makefile in src/Algorithms assumes that I have Matlab installed, resulting in the following error:

Folder /usr/local/MATLAB/R2017a/ does not exist. Please, set up MATLABDIR propertly
false
Makefile:18: recipe for target '/usr/local/MATLAB/R2017a/' failed
make: *** [/usr/local/MATLAB/R2017a/] Error 1

Include source code of other methods

NNOP (java)
NNPOM (java)
GPOR (bash script)

Add algorithm recommendations (large scale, high dimensional...)

Parallel running of experiments

Issues related to parallelisation of experiments:

matlabpool has been removed in recent versions of MATLAB: add compatibility between versions
Test compatibility with Octave
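
A minimal sketch of one way to choose between the two pool functions at run time, assuming that checking whether the function name exists is sufficient. pool_function is a hypothetical variable, and the serial fallback is an assumption about how an environment without a pool (for instance Octave) could be handled.

```matlab
% Pick whichever pool-opening function is available in this environment
if exist('parpool') ~= 0
    pool_function = 'parpool';      % newer MATLAB releases
elseif exist('matlabpool') ~= 0
    pool_function = 'matlabpool';   % older MATLAB releases
else
    pool_function = '';             % no pool available: run experiments serially
end

if isempty(pool_function)
    disp('No parallel pool function found, running serially');
else
    fprintf('Parallel pool function: %s\n', pool_function);
end
```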

Framework installation and build

There are some pending tasks related to ORCA installation. Installation is done with a Makefile (Linux) or the make() function (Linux/Windows).

Complete build/clean from src/Algorithms folder:

Clean of objects:

Clean all (objects + executables). This is useful when using several versions of matlab or octave.

Unit tests

Unit tests. Coverage test for all the algorithms and experiments.

Incompatibility with MEX in R2015 and later versions

Compilation fails with recent versions of MATLAB. Reproduced with R2017b.

Compatibility between versions

Some functions are deprecated depending on Matlab's version. Examples are:

Warning: The RandStream.setDefaultStream static method will be removed in a future release. Use
RandStream.setGlobalStream instead.
But also the ones related to optimization for KDLOR.

To properly fix this we need:

To better detect version (improve regular expressions at KDLOR.m)
To update code according to version
Add makefiles to ensure mex compatibility
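
As a rough illustration of version detection without fragile regular expressions, the sketch below checks for Octave and parses the major and minor version numbers. The variable names are hypothetical, and this is not the code currently in KDLOR.m.

```matlab
% Detect Octave: OCTAVE_VERSION exists as a built-in function only there
is_octave = exist('OCTAVE_VERSION', 'builtin') ~= 0;

% version returns a string such as '9.6.0.1072779 (R2019a)' in MATLAB
% or '4.4.1' in Octave; read the leading "major.minor" pair
version_numbers = sscanf(version, '%d.%d');
major = version_numbers(1);
minor = version_numbers(2);

fprintf('Octave: %d, version %d.%d\n', is_octave, major, minor);
```

Version-dependent code paths can then branch on is_octave, major and minor.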

Add datasets with description

Add ordinal regression datasets including data properties
Rename 'gpor' to 'matlab' in datasets and scripts
Add real problems datasets description

The results corresponding to the sum of matrices only make sense for certain types of partitions

The code should not calculate these results by default. Include a parameter to force it.

Avoid using combvec

The code only uses the function combvec from the nnet toolbox (in the file Experiment.m). However, it could be easily replaced by:

Link1
Link2
We should do it to reduce dependencies and future problems with Octave.
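
A possible replacement can be sketched with ndgrid, which is available in both MATLAB and Octave without extra toolboxes. The parameter values below are hypothetical, and this is only one option, not necessarily what the links above proposed.

```matlab
% Hypothetical hyper-parameter values (row vectors)
C = [1 2 3];
k = [4 5];

% ndgrid enumerates every pair; the first input varies fastest
[c_grid, k_grid] = ndgrid(C, k);

% One combination per column, the layout combvec(C, k) produces
combinations = [c_grid(:)'; k_grid(:)'];

disp(combinations);
```

For more than two parameters, ndgrid accepts additional inputs and outputs in the same way.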

Handling folds execution failure

If a method fails to finish a fold experiment (for example, fold 15), the results table is built without any notification to the user.

The second issue is that, in the report file ('dataset/results_test.csv'), the fold rows are numbered sequentially, so if fold 15 fails it still appears (e.g. test_dermatology.15), and the last experiment identifier is dropped instead (test_dermatology.9).
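
One way the first point could be addressed is to record failed folds and warn about them, as in this sketch. The loop body, the simulated failure and the identifier 'orca:foldFailed' are hypothetical and do not reflect ORCA's actual experiment code.

```matlab
num_folds = 3;
failed_folds = [];

for fold = 1:num_folds
    try
        % Placeholder for running one fold; fold 2 fails on purpose here
        if fold == 2
            error('orca:foldFailed', 'simulated failure');
        end
    catch err
        % Remember which fold failed and notify the user
        failed_folds(end + 1) = fold;
        warning('Fold %d failed: %s', fold, err.message);
    end
end

fprintf('Number of failed folds: %d\n', numel(failed_folds));
```

Keeping the original fold index in failed_folds would also allow the report rows to be labelled with the real fold number rather than a sequential counter.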

Incoherent output when disabling cv

Hi everyone,

I am using the ORCA library and am trying to run the SVORIM algorithm on my own data. I have already cross-validated it and I want to disable cross-validation. Therefore, I am using the following .ini file:

;SVORIM experiments
; Experiment ID
[test]
{general-conf}
seed = 1
; Datasets path
basedir = ../data
; Datasets to process (comma separated list or all to process all)
datasets = test1981
; Activate data standardization
standarize = false
; Number of folds for the parameters optimization
;num_folds = 0
; Crossvalidation metric
cvmetric = mae

; Method: algorithm and parameter
{algorithm-parameters}
algorithm = SVORIM
;kernelType = rbf

; Method's hyper-parameter values
{algorithm-hyper-parameters}
C = 1000
k = 0.01

Unfortunately, my predictions are now an incoherent mess of symbols such as: "ਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲ". When I leave cv enabled I don't have this problem. However, running this code with cv takes over a day, and since I have to run it a number of times, running without cv is preferred. Is my method of disabling cv in the ini file incorrect, or is something else causing this?

Finally, the "results_test.csv" and "results_train.csv" files are created correctly, with data that seems to be correct (although it has C = 0.1 and k = 0.1 instead of C = 1000 and k = 0.01, which is also strange). I hope you can help, and thanks in advance!

Indentation, comments and variables naming

We need to prettify the code:

Code indentation is not consistent across the files
All class and method descriptions have to match MATLAB's comment style
Some variable names are in Spanish

Include software binaries

Provide binaries in case compilation fails or to allow the use of the software in environments without a suitable compiler. Should we provide 32-bit binaries?

Linux Matlab binaries
Linux Octave binaries
Windows Matlab binaries
Windows Octave binaries

SVMOP performance differs in Windows

SVMOP has a different expected performance in svmopTest.m.

Linux: Accuracy Test 0.960000
Windows: Accuracy Test 0.920000

KDLOR in Octave

The 'qp' optimizer for KDLOR is not working in Octave.

REDSVM - Possible memory leak

While migrating REDSVM to ORCA-Python, I detected a memory leak during the execution of the algorithm. The problem seems to be in this part of the svm_free_model_content function:

orca/src/Algorithms/libsvm-rank-2.81/svm.cpp, lines 3416 to 3417 in bd9d682:

if(model_ptr->free_sv && model_ptr->l > 0 && model_ptr->SV != NULL)
	free((void *)(model_ptr->SV[0]));

This code only frees the memory of the first SV but not the rest of them. Changing that to:

if(model_ptr->free_sv && model_ptr->l > 0 && model_ptr->SV != NULL){
	for(int i=0;i<model_ptr->l;i++)
			free((void *)(model_ptr->SV[i]));
}

Solved the problem for me.

All the parameters should be configured

All the parameters should be configured and passed to the runAlgorithm method as variable arguments.
The type of the parameters should be inferred from the default values of the "parameters" structure.

test failure in ORBoost and SVORIM

After making a few edits to get the toolbox compiling on macOS (see #45), I got these errors when running runtestssingle:

ORBoost

Index exceeds matrix dimensions.

Error in ORBoost/privpredict (line 123)
            predicted = all(:,1);

Error in Algorithm/predict (line 80)
            [projected, predicted]= privpredict(obj,test);

Error in ORBoost/privfit (line 84)
            [projectedTrain,predictedTrain] = obj.predict(train.patterns);

Error in Algorithm/fit (line 65)
            [projectedTrain, predictedTrain] = obj.privfit(train, param);

Error in Algorithm/runAlgorithm (line 33)
            [mInf.projectedTrain, mInf.predictedTrain] = obj.fit(train,param);

Error in orboostTest (line 13)
info = algorithmObj.runAlgorithm(train,test);

Error in runtestssingle (line 37)
        eval(cmd(1:end-2))

SVORIM

Error using svorimTest (line 35)
Test accuracy does NOT match reference accuracy

Error in runtestssingle (line 37)
        eval(cmd(1:end-2))

Sort out exampledata (remove old folders)

RunAlgorithm refactor to fitpredict

[bug] In DataSet.standarizeData

In DataSet.standarizeFunction (line 106), XStds = std(X) computes the standard deviation of each column, not of each row.

Example:
>> X = [1 2 3; 4 5 6]

X =

 1     2     3
 4     5     6

>> std(X)

ans =

2.1213    2.1213    2.1213

matlab version: 9.6.0.1072779 (R2019a)

Suggested solution: change line 106 to XStds = std(X.')
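
For reference, the dimension along which std operates can be made explicit with its third argument, as in this sketch. Which dimension is appropriate depends on whether patterns are stored as rows or as columns, which is an assumption to check against DataSet.m; the variable names are hypothetical.

```matlab
X = [1 2 3; 4 5 6];

% Standard deviation of each column (the default, same as std(X))
std_per_column = std(X, 0, 1);

% Standard deviation of each row, as a column vector
std_per_row = std(X, 0, 2);

% The suggested std(X.') gives the same per-row values as a row vector
std_transposed = std(X.');

disp(std_per_column);
disp(std_per_row);
disp(std_transposed);
```

If each row of X is a pattern and each column a feature, the per-column result is the per-feature standard deviation.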

Homogenize algorithms API

Some algorithms have an inconsistent API. For instance, the test method of POM receives a matrix of patterns instead of the dataset structure.

Continuous integration

The software is not under continuous integration. We can integrate Octave with Travis.

Framework basic tests

This is mandatory to verify code correctness and ease further code improvement. Tasks are ordered by priority. The first two are the basis for installation tests.

Method level. For each method, create a test for base functionality. The test consists of predefined hyper-parameters and a test dataset with known reference performance.
Experiment script level test. We have to check that the example experiments run to completion. Because hyper-parameter optimization is non-deterministic, initially we do not consider reference performance.

Include examples in tutorials in tests

Documentation and tutorials are maintained as .md files and are not automatically tested. Basically, the two alternatives are Sphinx or notebooks, but perhaps the latter is more suitable for the tutorials.

How to use Matlab with Jupyter:
https://am111.readthedocs.io/en/latest/jmatlab_install.html
https://am111.readthedocs.io/en/latest/jmatlab_use.html

Tutorials

Normal use through matlab
Use for parallelization (condor, parfor)
Getting started (git clone, compilation...)

POM improvements

Include more link functions
Rewrite predict() to use mnrval()

SVOREX - Segmentation Fault

First of all, I'm running ORCA on Matlab R2018a.

I've been crossvalidating SVOREX with a big set of parameters. At some point (i.e. with a specific combination of parameters detailed below), SVOREX has returned a segmentation fault with the following error description:

Warning: KKT conditions are violated on bias!!! -0.101231 with C=1.000 K=0.001 Segmentation Fault

To my knowledge, this comes from the following lines in smo_routine.c (included in the SVOREX folder):

if (settings->bmu_low[loop-1] - settings->bmu_up[loop-1]>TOL){
	printf("Warning: KKT conditions are violated on bias!!! %f with C=%.3f K=%.3f\r\n",
	settings->bmu_low[loop-1] + settings->bmu_up[loop-1], VC, KAPPA);
	exit(1);
}

In my case, removing the exit(1); line makes the code run successfully; however, I may be skipping a criterion that must be satisfied.

The dataset (patterns and labels of both train and test) is attached to this issue. The algorithm is SVOREX, and the parameter combination is: C=1.000 -- K=0.001.

Dataset.zip

Improve addpath in methods using C code

Methods using C code, such as SVORIM, SVOREX..., call addpath only in runAlgorithm. However, calling the train/test methods directly raises an error, since the path is only added in runAlgorithm.

Potential solutions are:

Place addpath in the constructor and rmpath in the destructor (more general)
Add addpath/rmpath in train and test methods.

Flag 'all' in .ini file

Hi!

When I try to run an experiment of 20 datasets with the option datasets = all, only first dataset in alphabetical order is considered. I have to write the names of the 20 datasets separated by comma in order to run the experiment properly.

The configuration file with the datasets = all option:

[test]
{general-conf}
seed = 1
report_sum = true

; Datasets path
basedir = ../../data/datasets/orca/5classes/original/

; Datasets
datasets = all

; Standardization
standarize = true

; Method: algorithm and parameter
{algorithm-parameters}
algorithm = POM

Returns:

Setting up experiments...
Running experiment exp-test-BTC-AR-1.ini
Calculating results...
Experiments/exp-2019-7-23-13-29-59/Results/BTC-AR-test/dataset
Experiments/exp-2019-7-23-13-29-59/Results/BTC-AR-test/dataset
ans = Experiments/exp-2019-7-23-13-29-59

ORCA only uses one dataset. However, when I list all of them in a comma-separated list, the experiment runs correctly:

[test]
{general-conf}
seed = 1
report_sum = true

; Datasets path
basedir = ../../data/datasets/orca/5classes/original/

; Datasets
datasets = BTC-AR-trend-CNDL, ETH-AR-trend, ETH-AR-crypto-CNDL, ETH-AR-trend-crypto, BTC-AR-trend-crypto-CNDL, ETH-AR-CNDL, BTC-AR-CC-trend-CNDL, BTC-AR-CNDL, BTC-AR-CC-trend, ETH-AR-crypto, ETH-AR-trend-crypto-CNDL, BTC-AR-trend, ETH-AR-CC-trend, ETH-AR-trend-CNDL, BTC-AR-crypto-CNDL, BTC-AR-crypto, ETH-AR-CC-trend-CNDL, ETH-AR, BTC-AR, BTC-AR-trend-crypto

; Standardization
standarize = true

; Method: algorithm and parameter
{algorithm-parameters}
algorithm = POM

Returns:

Setting up experiments...
Running experiment exp-test-BTC-AR-1.ini
Running experiment exp-test-BTC-AR-CC-trend-1.ini
Running experiment exp-test-BTC-AR-CC-trend-CNDL-1.ini
Running experiment exp-test-BTC-AR-CNDL-1.ini
Running experiment exp-test-BTC-AR-crypto-1.ini
Running experiment exp-test-BTC-AR-crypto-CNDL-1.ini
Running experiment exp-test-BTC-AR-trend-1.ini
Running experiment exp-test-BTC-AR-trend-CNDL-1.ini
Running experiment exp-test-BTC-AR-trend-crypto-1.ini
Running experiment exp-test-BTC-AR-trend-crypto-CNDL-1.ini
Running experiment exp-test-ETH-AR-1.ini
Running experiment exp-test-ETH-AR-CC-trend-1.ini
Running experiment exp-test-ETH-AR-CC-trend-CNDL-1.ini
Running experiment exp-test-ETH-AR-CNDL-1.ini
Running experiment exp-test-ETH-AR-crypto-1.ini
Running experiment exp-test-ETH-AR-crypto-CNDL-1.ini
Running experiment exp-test-ETH-AR-trend-1.ini
Running experiment exp-test-ETH-AR-trend-CNDL-1.ini
Running experiment exp-test-ETH-AR-trend-crypto-1.ini
Running experiment exp-test-ETH-AR-trend-crypto-CNDL-1.ini
Calculating results...
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-CC-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-CC-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-CC-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-CC-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-CC-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-CC-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-CC-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-CC-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-test/dataset
ans = Experiments/exp-2019-7-23-13-38-56

I'm running the experiment from a Jupyter notebook with an Octave kernel (version 4.4.1) on macOS High Sierra 10.13.6.

Kernel type always RBF for SVOREX, SVORIM, SVC1v1...

The kernel type is always RBF for these methods, while it is configured...

Abstract methods not available in Octave

Abstract methods are not available in Octave (see bug).

The current workaround is to comment out those methods in the abstract classes. However, the proper solution may be to define them as standard methods that throw an exception in the parent class, so they can only be called if implemented in child classes. That way we get a kind of interface with the tools available in both Matlab and Octave.

ORBoost: textscan Unknown parameter 'bufsize'

This parameter is deprecated in newer MATLAB versions. It affects:
https://github.com/ayrna/orca/blob/master/src/Algorithms/ORBoost.m#L130

Update code to adapt the function call to MATLAB's version
Suppress the printing of iterations, which generates 2000 new lines
