orca's People

Contributors

durka, javism, mperezortiz, pagutierrez

orca's Issues

Parameter processing can confuse parameters with similar names

Parameter processing can be unstable depending on the parameter names the user chooses. Currently, if we have two different parameters, algorithm and algorithmDefault, either of them can end up being assigned to obj.method. The issue comes from Experiment.m:

elseif strncmpi('algorithm',nueva_linea, 3),

This compares only the first 3 characters. A quick fix can be to use length(), so that:

elseif strncmpi('algorithm',nueva_linea, length('algorithm')),
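A minimal sketch of the comparison, written in Python as a stand-in for the MATLAB logic. `prefix_match` only approximates `strncmpi`, and `key_of` is a hypothetical helper that is not part of ORCA. Under this approximation, comparing the first length('algorithm') characters still matches a line that starts with algorithmDefault, so comparing the whole key name may be the safer option:

```python
def prefix_match(key, line, n):
    """Approximate strncmpi: case-insensitive match on the first n characters."""
    if len(key) < n or len(line) < n:
        return False
    return key[:n].lower() == line[:n].lower()


def key_of(line):
    """Hypothetical helper: return the lower-cased name left of the first '='."""
    return line.split("=", 1)[0].strip().lower()


line = "algorithmDefault = POM"

# Both prefix lengths accept the wrong parameter.
three_chars = prefix_match("algorithm", line, 3)
full_prefix = prefix_match("algorithm", line, len("algorithm"))

# Comparing the complete key tells the two parameters apart.
exact = key_of(line) == "algorithm"
```

With the exact comparison, the order in which the branches are tested no longer matters.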

Issues testing from Octave

Hi,

I'm running ORCA on Octave, installed from Ubuntu 18.10.

Installation goes fine, but when I try to run the tests (from the Octave shell) I get the following errors:

warning: struct: converting a classdef object into a struct overrides the access restrictions defined for properties. All properties are returned, including private and protected ones.
warning: called from
    fieldnames at line 47 column 11
    parseArgs at line 114 column 29
    POM at line 56 column 13
    pomTest at line 7 column 14
    runtestssingle at line 35 column 5
panic: Segmentation fault -- stopping myself...
.........................
Performing test for POM
Accuracy Train 0.408889, Accuracy Test 0.333333
Test accuracy matchs reference accuracy
Processing redsvmTest.m...
attempting to save variables to 'octave-workspace'...
error: octave_base_value::save_binary(): wrong type argument 'object'

Any clue as to what is causing the error?

Windows port

Topics related to the Windows port. ORCA is being ported to Windows. The methods that do not depend on external C/C++ code should work out of the box. For the rest, some work has to be done using GCC and Make on Windows.

List of methods that work on Windows (R2017b):

  • CCSVC
  • ELMOR
  • KDLOR
  • OBBE
  • ORBOOST
  • POM
  • REDSVM
  • SVC1V1
  • SVC1VA
  • SVMOP
  • SVOREX
  • SVORIMLIN
  • SVORIM
  • SVR

Unified experiments ini

Hi! I have been trying to run a unified experiment file (ini format) with a mixture of classifiers (POM, KDLOR) and I always get the same error:

Setting up experiments...
Running experiment exp-kdlor-amae-tormentas-1.ini
Running experiment exp-pom-tormentas-1.ini
error: Error 'C' is not a recognized class parameter name
error: called from
    paramopt>checkParameters at line 135 column 9
    paramopt at line 28 column 1
    crossValideParams at line 180 column 22
    run at line 68 column 26
    launch at line 55 column 13
    runExperiments at line 90 column 25

This is the configuration file:

[pom]
{general-conf}
seed = 1
basedir = ../exampledata/1-holdout/
datasets = tormentas
standarize = true

{algorithm-parameters}
algorithm = POM

[kdlor-amae]
{general-conf}
seed = 1
basedir = ../exampledata/1-holdout/
datasets = tormentas
standarize = true
num_folds = 5
cvmetric = amae

{algorithm-parameters}
algorithm = KDLOR
kernelType = rbf

{algorithm-hyper-parameters-to-cv}
C = 10.^(-3:1:3)
k = 10.^(-3:1:3)
u = 0.01,0.001,0.0001,0.00001,0.000001

It seems that POM is trying to use the hyper-parameters for KDLOR.

Thanks a lot!
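One possible cause, which is an assumption and not confirmed in this thread, is that settings collected for one experiment leak into the next one while the unified file is split. The Python sketch below shows the intended isolation: each `[experiment]` gets its own dictionary, and each `{subsection}` lives inside it. `parse_unified_ini` is a hypothetical name, not ORCA's parser:

```python
def parse_unified_ini(text):
    """Split a unified experiment file into one independent dict per [experiment]."""
    experiments = {}
    current = None   # dict of the experiment being read
    section = None   # dict of the {subsection} being read
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            current = experiments.setdefault(line[1:-1], {})
            section = None  # nothing carries over from the previous experiment
        elif line.startswith("{") and line.endswith("}"):
            if current is None:
                raise ValueError("subsection found before any [experiment]")
            section = current.setdefault(line[1:-1], {})
        elif "=" in line:
            if section is None:
                raise ValueError("key found outside a {subsection}")
            key, value = line.split("=", 1)
            section[key.strip()] = value.strip()
    return experiments
```

With the file above, the `pom` entry would contain no hyper-parameter subsection at all, while `kdlor-amae` keeps its own C, k and u grids.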

Update or suppress bash Makefiles

We need to update the bash Makefiles to match the rules in the Matlab/Octave set of make.m scripts. However, I'd suggest suppressing the bash Makefiles, since they can be confusing: the user needs to properly set up the environment variables that point to the Matlab/Octave install dir.

Refactor predict()

The classification method of an algorithm is obj.predict(patterns, model). Following OOP convention, the model should be stored in obj.model, thus allowing obj.predict(patterns). However, this would affect ensemble models and binary decomposition methods, since they usually hold several models and reuse predict several times to perform the prediction (see OPBE).

The changes can be done, but are not straightforward.
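A minimal sketch of how the two cases could coexist, in Python and under the assumption that each sub-model becomes its own object. `ThresholdClassifier` and `OrdinalDecomposition` are hypothetical toy classes for 1-D patterns, not ORCA code:

```python
class ThresholdClassifier:
    """Toy binary classifier: the fitted model is stored in self.model."""

    def __init__(self):
        self.model = None

    def fit(self, patterns, targets):
        positives = [x for x, t in zip(patterns, targets) if t == 1]
        negatives = [x for x, t in zip(patterns, targets) if t == 0]
        # Model: midpoint between the two class means.
        mean_pos = sum(positives) / len(positives)
        mean_neg = sum(negatives) / len(negatives)
        self.model = (mean_pos + mean_neg) / 2
        return self

    def predict(self, patterns):
        return [1 if x > self.model else 0 for x in patterns]


class OrdinalDecomposition:
    """Toy decomposition: one member per question 'is the label greater than k?'."""

    def __init__(self, n_classes):
        self.members = [ThresholdClassifier() for _ in range(n_classes - 1)]

    def fit(self, patterns, labels):
        for k, member in enumerate(self.members, start=1):
            member.fit(patterns, [1 if y > k else 0 for y in labels])
        return self

    def predict(self, patterns):
        # Each member predicts with its own stored model; the answers are summed.
        votes = [member.predict(patterns) for member in self.members]
        return [1 + sum(column) for column in zip(*votes)]
```

The ensemble never passes a model around: it calls predict(patterns) on members that each keep their own state.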

ORCA not listed on the octave package index

Dear maintainers,

I noticed your package does not appear on the Octave package index. Please consider adding it! You can find the Octave package index and instructions on how to add your package here: https://gnu-octave.github.io/packages/

Ideally you should create a package in the Octave package format that can be installed via Octave's pkg command; however, this is no longer a strict requirement, and packages with custom installation instructions are also accepted on the index, as long as those instructions are clear. :)

Bug in parallel processing

After doing a parallel run of tests:

Parallel pool using the 'local' profile is shutting down.
Calculating results...
Undefined function or variable 'myExperiment'.

Error in Utilities.runExperiments (line 100)
            Utilities.results([logsDir '/' 'Results'],'report_sum', myExperiment.report_sum, 'train', true);

Error in runtestscv (line 38)
    exp_dir = Utilities.runExperiments([tests_dir '/' files(i).name], 'parallel', true);

Hello, I have a question

Hello, I am a student. I ran the code, but for some datasets a "Matrix dimensions don't agree" error is shown in the results method of the Utilities class:
cm = confusionmat(act{h},pred{h});
cm_sum = cm_sum + cm;
I found that the problem is the confusionmat function: the dimensions of the cm it returns do not agree with those of cm_sum.
Can you give me some advice?
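A likely cause, though not confirmed in this thread, is that some folds do not contain every class, so the confusion matrix of such a fold is smaller than the accumulated one. The Python sketch below builds every matrix over a fixed label list so that all folds have the same shape. The function names are hypothetical. In MATLAB, confusionmat accepts an 'Order' argument that may serve the same purpose:

```python
def confusion_matrix(actual, predicted, labels):
    """Confusion matrix over a fixed label list, so its shape never changes."""
    index = {label: i for i, label in enumerate(labels)}
    cm = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        cm[index[a]][index[p]] += 1
    return cm


def add_matrices(left, right):
    """Element-wise sum of two matrices with identical shape."""
    return [[x + y for x, y in zip(row_l, row_r)] for row_l, row_r in zip(left, right)]


labels = [1, 2, 3]
fold_a = confusion_matrix([1, 2, 2], [1, 2, 1], labels)   # class 3 absent
fold_b = confusion_matrix([1, 2, 3], [1, 3, 3], labels)   # all classes present
cm_sum = add_matrices(fold_a, fold_b)
```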

Installation under Octave

Hi,

I have Octave installed but when compiling the sources (I'm following the install guide), the Makefile in src/Algorithms assumes that I have Matlab installed, resulting in the following error:

Folder /usr/local/MATLAB/R2017a/ does not exist. Please, set up MATLABDIR propertly
false
Makefile:18: recipe for target '/usr/local/MATLAB/R2017a/' failed
make: *** [/usr/local/MATLAB/R2017a/] Error 1

Parallel running of experiments

Issues related to parallelisation of experiments:

  • matlabpool is removed in recent versions of matlab: add compatibility between versions
  • test compatibility with Octave

Framework installation and build

There are some pending tasks related to ORCA installation. The installation is done with the Makefile (Linux) or the make() function (Linux/Windows).

Complete build/clean from src/Algorithms folder:

  • Makefile Linux Matlab
  • Makefile Linux Octave
  • make() Linux Matlab
  • make() Linux Octave
  • make() Windows Matlab
  • make() Windows Octave

Clean of objects:

  • Makefile Linux Matlab/Octave
  • make() Linux Matlab
  • make() Linux Octave
  • make() Windows Matlab
  • make() Windows Octave

Clean all (objects + executables). This is useful when using several versions of matlab or octave.

  • Makefile Linux Matlab/Octave
  • make() Linux Matlab
  • make() Linux Octave
  • make() Windows Matlab
  • make() Windows Octave

Unit tests

  • Unit tests. Coverage test for all the algorithms and experiments.

Compatibility between versions

Some functions are deprecated depending on Matlab's version. An example is:

Warning: The RandStream.setDefaultStream static method will be removed in a future release. Use RandStream.setGlobalStream instead.

The same applies to the ones related to optimin for KDLOR.

To properly fix this we need:

  • To better detect version (improve regular expressions at KDLOR.m)
  • To update code according to version
  • Add makefiles to ensure mex compatibility
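The version detection step can be sketched as follows, in Python and assuming version strings of the form shown elsewhere in these issues (for example "9.6.0.1072779 (R2019a)" or "4.4.1"). `parse_version` and `at_least` are hypothetical names. Comparing numeric tuples avoids the pitfalls of comparing version strings as text:

```python
import re

# Leading "major.minor[.patch]" of a version string; anything after is ignored.
_VERSION_RE = re.compile(r"^\s*(\d+)\.(\d+)(?:\.(\d+))?")


def parse_version(version_string):
    """Return (major, minor, patch) as integers; a missing patch becomes 0."""
    match = _VERSION_RE.match(version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return tuple(int(part) if part else 0 for part in match.groups())


def at_least(version_string, minimum):
    """True when the parsed version is greater than or equal to the minimum tuple."""
    return parse_version(version_string) >= minimum
```

A caller could then choose between the old and the new function name with a single check such as at_least(version_string, (7, 12, 0)); that threshold is only an illustrative value.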

Add datasets with description

  • Add ordinal regression datasets including data properties
  • Rename 'gpor' to 'matlab' in datasets and scripts
  • Add real problems datasets description

Avoid using combvec

The only function the code uses from the nnet toolbox is combvec (in the file Experiment.m). However, it could easily be replaced by:

  • Link1
  • Link2

We should do it to reduce dependencies and future problems with Octave.
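What combvec provides here is every combination of the candidate hyper-parameter values. A Python sketch of that operation with the standard library, where `all_combinations` is a hypothetical name and the grid values are illustrative only:

```python
from itertools import product


def all_combinations(grid):
    """Return one dict per combination of the candidate values in grid."""
    names = list(grid)
    return [
        dict(zip(names, values))
        for values in product(*(grid[name] for name in names))
    ]


grid = {"C": [0.1, 1, 10], "k": [0.01, 0.1]}
combos = all_combinations(grid)
```

The same Cartesian product can be written in MATLAB/Octave without a toolbox, which is what the replacement is after.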

Handling folds execution failure

If a method fails to complete a fold experiment (for example, fold 15), the results table is built without any notification to the user.

The second issue is that in the report file ('dataset/results_test.csv') the fold rows are numbered sequentially, so if fold 15 fails, its identifier still appears (e.g. test_dermatology.15) and the last experiment identifier is dropped instead (test_dermatology.9).
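One way to address both points, sketched in Python under the assumption that each finished fold reports its own number: rows are labelled with the real fold number, and missing folds are returned so the caller can warn the user. `build_results_table` is a hypothetical name:

```python
def build_results_table(dataset, expected_folds, results):
    """Label rows by their real fold number and list the folds that are missing.

    results maps fold number -> metric value for the folds that finished.
    """
    rows = [(f"{dataset}.{fold}", results[fold]) for fold in sorted(results)]
    missing = [fold for fold in expected_folds if fold not in results]
    return rows, missing


# Fold 2 failed, so it has no entry in the finished results.
rows, missing = build_results_table(
    "test_dermatology", range(1, 4), {1: 0.5, 3: 0.7}
)
```

Here the failed fold leaves a visible gap in the identifiers and is reported explicitly.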

Incoherent output when disabling cv

Hi everyone,

I am using the ORCA library and am trying to run the SVORIM algorithm on my own data. I have already cross-validated it and I want to disable cross-validation. Therefore, I am using the following .ini file:

;SVORIM experiments
; Experiment ID
[test]
{general-conf}
seed = 1
; Datasets path
basedir = ../data
; Datasets to process (comma separated list or all to process all)
datasets = test1981
; Activate data standardization
standarize = false
; Number of folds for the parameters optimization
;num_folds = 0
; Crossvalidation metric
cvmetric = mae

; Method: algorithm and parameter
{algorithm-parameters}
algorithm = SVORIM
;kernelType = rbf

; Method's hyper-parameter values
{algorithm-hyper-parameters}
C = 1000
k = 0.01

Unfortunately, my predictions are now an incoherent mess of symbols such as: "ਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲਲ". When I leave cv enabled I don't have this problem. However, running this code with cv takes over a day, and since I have to run it a number of times, running without cv is preferred. Is my method of disabling cv in the ini file incorrect, or is something else happening that is causing this?

Finally, the "results_test.csv" and "results_train.csv" files are created correctly with data in them that seems to be correct (although they show C = 0.1 and k = 0.1 instead of C = 1000 and k = 0.01, which is also strange). I hope you can help and thanks in advance!

Indentation, comments and variables naming

We need to prettify the code:

  • Code indentation is not consistent across the files
  • All class and method descriptions have to match MATLAB's comment style
  • Some variable names are in Spanish

Include software binaries

Provide binaries in case compilation fails or to allow the use of the software in environments without a suitable compiler. Should we provide 32-bit binaries?

  • Linux Matlab binaries
  • Linux Octave binaries
  • Windows Matlab binaries
  • Windows Octave binaries

REDSVM - Possible memory leak

While migrating REDSVM to ORCA-Python I detected a memory leak during the execution of the algorithm. The problem seems to be this part of the svm_free_model_content function:

if(model_ptr->free_sv && model_ptr->l > 0 && model_ptr->SV != NULL)
free((void *)(model_ptr->SV[0]));

This code only frees the memory of the first SV but not the rest of them. Changing that to:

if(model_ptr->free_sv && model_ptr->l > 0 && model_ptr->SV != NULL){
	for(int i=0;i<model_ptr->l;i++)
		free((void *)(model_ptr->SV[i]));
}

solved the problem for me.

All the parameters should be configurable

All the parameters should be configurable and passed to the runAlgorithm method as variable arguments.
The type of each parameter should be inferred from the default values in the "parameters" structure.
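A minimal Python sketch of inferring types from defaults, assuming values arrive as text (for instance from an ini file). `coerce_to_defaults` and the example parameter names are hypothetical:

```python
def coerce_to_defaults(defaults, overrides):
    """Convert each override to the type of the matching default value."""
    params = dict(defaults)
    for name, raw in overrides.items():
        if name not in defaults:
            raise KeyError(f"'{name}' is not a recognized parameter name")
        default = defaults[name]
        if isinstance(default, bool):
            # bool("false") would be True, so booleans are parsed explicitly.
            params[name] = str(raw).strip().lower() in ("true", "1")
        else:
            params[name] = type(default)(raw)
    return params


defaults = {"C": 0.1, "kernelType": "rbf", "epochs": 10, "verbose": False}
params = coerce_to_defaults(defaults, {"C": "1000", "epochs": "50", "verbose": "true"})
```

Unknown names are rejected early, which mirrors the "not a recognized class parameter name" error seen in another issue.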

test failure in ORBoost and SVORIM

After making a few edits to get the toolbox compiling on macOS (see #45), I got these errors when running runtestssingle:

ORBoost

Index exceeds matrix dimensions.

Error in ORBoost/privpredict (line 123)
            predicted = all(:,1);

Error in Algorithm/predict (line 80)
            [projected, predicted]= privpredict(obj,test);

Error in ORBoost/privfit (line 84)
            [projectedTrain,predictedTrain] = obj.predict(train.patterns);

Error in Algorithm/fit (line 65)
            [projectedTrain, predictedTrain] = obj.privfit(train, param);

Error in Algorithm/runAlgorithm (line 33)
            [mInf.projectedTrain, mInf.predictedTrain] = obj.fit(train,param);

Error in orboostTest (line 13)
info = algorithmObj.runAlgorithm(train,test);

Error in runtestssingle (line 37)
        eval(cmd(1:end-2))

SVORIM

Error using svorimTest (line 35)
Test accuracy does NOT match reference accuracy

Error in runtestssingle (line 37)
        eval(cmd(1:end-2))

[bug] In DataSet.standarizeData

In DataSet.standarizeFunction (line 106), XStds = std(X) computes the standard deviation of each column, not of each row.

Example:
>> X = [1 2 3; 4 5 6]

X =

 1     2     3
 4     5     6

>> std(X)

ans =

2.1213    2.1213    2.1213

matlab version: 9.6.0.1072779 (R2019a)

Suggested solution: change line 106 to XStds = std(X.')
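The difference between the two directions can be checked with a small Python sketch on the same matrix, using the sample standard deviation. Which direction is appropriate depends on whether patterns are stored as rows or as columns, which is not stated in this report:

```python
from statistics import stdev

X = [[1, 2, 3],
     [4, 5, 6]]

# One value per column: each column holds the pairs (1,4), (2,5), (3,6).
std_per_column = [stdev(column) for column in zip(*X)]

# One value per row: each row holds three consecutive integers.
std_per_row = [stdev(row) for row in X]
```

The per-column values reproduce the 2.1213 shown above, while the per-row values are what std(X.') would return.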

Homogenize algorithms API

Some algorithms present an inconsistent API. For instance, the test method of POM receives a matrix of patterns instead of the dataset structure.

Continuous integration

The software is not under continuous integration. We can integrate Octave with Travis.

Framework basic tests

This is mandatory to verify code correctness and ease further code improvement. Tasks are ordered by priority. The first two are the basis for performing installation tests.

  • Method level. For each method, create a test for base functionality. The test consists of predefined hyper-parameters and a test dataset with known reference performance.
  • Experiment script level. We have to check that the example experiments run to completion. Because hyper-parameter optimization is non-deterministic, initially we do not consider reference performance.
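The method-level check can be sketched in Python as a comparison against a stored reference within a tolerance. `accuracy` and `matches_reference` are hypothetical helpers, and the tolerance is an assumption chosen to match a reference rounded to six decimals:

```python
def accuracy(actual, predicted):
    """Fraction of patterns whose predicted label equals the actual label."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)


def matches_reference(value, reference, tol=1e-6):
    """True when value is within tol of the stored reference performance."""
    return abs(value - reference) <= tol


# Example: two of three labels are right, reference stored as 0.666667.
test_accuracy = accuracy([1, 2, 3], [1, 2, 2])
ok = matches_reference(test_accuracy, 0.666667)
```

A tolerance is needed because the reference is stored with limited precision and platforms may differ in the last digits.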

Tutorials

  • Normal use through matlab
  • Use for parallelization (condor, parfor)
  • Getting started (git clone, compilation...)

POM improvements

  • Include more link functions
  • Rewrite predict() to use mnrval()

SVOREX - Segmentation Fault

First of all, I'm running ORCA on Matlab R2018a.

I've been crossvalidating SVOREX with a big set of parameters. At some point (i.e. with a specific combination of parameters detailed below), SVOREX has returned a segmentation fault with the following error description:

Warning: KKT conditions are violated on bias!!! -0.101231 with C=1.000 K=0.001 Segmentation Fault

To my knowledge, this comes from the following lines in smo_routine.c (included in the SVOREX folder):

if (settings->bmu_low[loop-1] - settings->bmu_up[loop-1]>TOL){
	printf("Warning: KKT conditions are violated on bias!!! %f with C=%.3f K=%.3f\r\n",
	settings->bmu_low[loop-1] + settings->bmu_up[loop-1], VC, KAPPA);
	exit(1);
}

In my case, after removing the exit(1); line the code works successfully. However, I could be skipping a criterion that must be satisfied.

The dataset (patterns and labels of both train and test) is attached to this issue. The algorithm is SVOREX, and the parameter combination is: C=1.000 -- K=0.001.

Dataset.zip

Improve addpath in methods using C code

Methods using C code, such as SVORIM, SVOREX..., perform addpath only in runAlgorithm. However, if the train/test methods are called directly, there is an error because the path has only been added in runAlgorithm.

Potential solutions are:

  • Place addpath in the constructor and rmpath in the destructor (more general)
  • Add addpath/rmpath in train and test methods.
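The first option pairs adding the path with removing it, tied to the lifetime of the object. A loose Python analogue of that pairing, using sys.path in place of the MATLAB path; `NativePath` and the directory name are hypothetical:

```python
import sys


class NativePath:
    """Add a directory to the search path on entry and remove it on exit."""

    def __init__(self, directory):
        self.directory = directory
        self._added = False

    def __enter__(self):
        # Only add (and later remove) the entry if it was not already there.
        if self.directory not in sys.path:
            sys.path.insert(0, self.directory)
            self._added = True
        return self

    def __exit__(self, exc_type, exc, traceback):
        if self._added:
            sys.path.remove(self.directory)
            self._added = False
        return False  # never swallow exceptions


with NativePath("/hypothetical/orca/native"):
    inside = "/hypothetical/orca/native" in sys.path
after = "/hypothetical/orca/native" in sys.path
```

Tracking whether the entry was added by this object avoids removing a path that the user had set up beforehand.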

Flag 'all' in .ini file

Hi!

When I try to run an experiment on 20 datasets with the option datasets = all, only the first dataset in alphabetical order is considered. I have to write the names of the 20 datasets as a comma-separated list in order to run the experiment properly.

The configuration file with the datasets = all option:

[test]
{general-conf}
seed = 1
report_sum = true

; Datasets path
basedir = ../../data/datasets/orca/5classes/original/

; Datasets
datasets = all

; Standardization
standarize = true

; Method: algorithm and parameter
{algorithm-parameters}
algorithm = POM

Returns:

Setting up experiments...
Running experiment exp-test-BTC-AR-1.ini
Calculating results...
Experiments/exp-2019-7-23-13-29-59/Results/BTC-AR-test/dataset
Experiments/exp-2019-7-23-13-29-59/Results/BTC-AR-test/dataset
ans = Experiments/exp-2019-7-23-13-29-59

ORCA only uses one dataset. However, when I list all of them in a comma-separated list, the experiment runs correctly:

[test]
{general-conf}
seed = 1
report_sum = true

; Datasets path
basedir = ../../data/datasets/orca/5classes/original/

; Datasets
datasets = BTC-AR-trend-CNDL, ETH-AR-trend, ETH-AR-crypto-CNDL, ETH-AR-trend-crypto, BTC-AR-trend-crypto-CNDL, ETH-AR-CNDL, BTC-AR-CC-trend-CNDL, BTC-AR-CNDL, BTC-AR-CC-trend, ETH-AR-crypto, ETH-AR-trend-crypto-CNDL, BTC-AR-trend, ETH-AR-CC-trend, ETH-AR-trend-CNDL, BTC-AR-crypto-CNDL, BTC-AR-crypto, ETH-AR-CC-trend-CNDL, ETH-AR, BTC-AR, BTC-AR-trend-crypto

; Standardization
standarize = true

; Method: algorithm and parameter
{algorithm-parameters}
algorithm = POM

Returns:

Setting up experiments...
Running experiment exp-test-BTC-AR-1.ini
Running experiment exp-test-BTC-AR-CC-trend-1.ini
Running experiment exp-test-BTC-AR-CC-trend-CNDL-1.ini
Running experiment exp-test-BTC-AR-CNDL-1.ini
Running experiment exp-test-BTC-AR-crypto-1.ini
Running experiment exp-test-BTC-AR-crypto-CNDL-1.ini
Running experiment exp-test-BTC-AR-trend-1.ini
Running experiment exp-test-BTC-AR-trend-CNDL-1.ini
Running experiment exp-test-BTC-AR-trend-crypto-1.ini
Running experiment exp-test-BTC-AR-trend-crypto-CNDL-1.ini
Running experiment exp-test-ETH-AR-1.ini
Running experiment exp-test-ETH-AR-CC-trend-1.ini
Running experiment exp-test-ETH-AR-CC-trend-CNDL-1.ini
Running experiment exp-test-ETH-AR-CNDL-1.ini
Running experiment exp-test-ETH-AR-crypto-1.ini
Running experiment exp-test-ETH-AR-crypto-CNDL-1.ini
Running experiment exp-test-ETH-AR-trend-1.ini
Running experiment exp-test-ETH-AR-trend-CNDL-1.ini
Running experiment exp-test-ETH-AR-trend-crypto-1.ini
Running experiment exp-test-ETH-AR-trend-crypto-CNDL-1.ini
Calculating results...
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-CC-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-CC-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-CC-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-CC-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-CC-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-CC-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/BTC-AR-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-CC-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-CC-trend-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-crypto-CNDL-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-crypto-test/dataset
Experiments/exp-2019-7-23-13-38-56/Results/ETH-AR-trend-test/dataset
ans = Experiments/exp-2019-7-23-13-38-56

I'm running the experiment from a Jupyter notebook with the Octave kernel (version 4.4.1) on macOS High Sierra 10.13.6.

Abstract methods not available in Octave

Abstract methods are not available in Octave (see bug).

The solution right now is to comment out those methods in the abstract classes. However, the proper solution may be to define them as standard methods that throw an exception in the parent class, so that they can only be called if implemented in child classes. That way we have a kind of interface with the tools available in both Matlab and Octave.
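The pattern can be sketched in Python as follows. The names fit and privfit are borrowed from the stack traces in other issues, while `MajorityClass` and the dataset layout are hypothetical:

```python
class Algorithm:
    """Parent class: privfit is a standard method that fails unless overridden."""

    def privfit(self, train, param):
        raise NotImplementedError(
            f"{type(self).__name__} must implement privfit"
        )

    def fit(self, train, param=None):
        # Shared entry point: delegates to the child implementation.
        return self.privfit(train, param)


class MajorityClass(Algorithm):
    """Hypothetical child: predicts the most frequent training label."""

    def privfit(self, train, param):
        labels = train["targets"]
        self.model = max(set(labels), key=labels.count)
        return self.model
```

Calling fit on the parent class raises the error, while the child class works because it overrides privfit.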
