
random_forest_run's Introduction

RFR

An extensible C++ library for random forests with Python bindings, released under a BSD3 license.

Requirements

The C++ library itself needs no additional libraries, only a C++11-capable compiler. You also need Boost if you want to compile the unit tests. Development is done using GCC 7.2. With older compilers, you probably have to set CMAKE_CXX_FLAGS to -std=c++11.

CMAKE
DOXYGEN (if you want docstrings, which you probably do)
SWIG > 3.0

Installing the Python Bindings

We upload the latest version to PYPI, so you can install it via

pip install pyrfr

Development is done with Python 3.7-3.10 on Ubuntu, and the unit tests are executed via GitHub Actions. We no longer support Python 2. Contact us if you experience any irregularities.

USAGE

For now, the files ./tests/pyrfr_unit_test_*.py inside the repository serve as the only real documentation of the Python bindings besides the docstrings.

random_forest_run's Issues

Get all node values from all trees in rf

Hi!
I'm working on a research experiment that requires me to get the "max possible value" that your regression forest can predict at the moment.

I've been trying to get that value using the Python API. I use RandomForestWithInstances in Python.
It seems there is no way to get something like "all nodes from all trees" other than serializing the RandomForest to a string or to tex and reading it.

model.rf.ascii_string_representation() gives me something like this:

{	
    "value0": {
        "value0": 10,
        "value1": 9,
        "value2": true,
        "value3": false,
        "value4": {
            "value0": 5,
            "value1": 20,
            "value2": 3,
            "value3": 2.0,
            "value4": 3,
            "value5": 1.0,
            "value6": 1048576,
            "value7": 1e-8,
            "value8": 1000.0,
            "value9": false
        }
    },
    "value1": [
        {
            "value0": [
                {
                    "value0": [],
                    "value1": [],
                    "value2": 0,
                    "value3": {
                        "value0": 1,
                        "value1": 2
                    },
                    "value4": {
                        "value0": 0.5555555555555556,
                        "value1": 0.4444444444444444
                    },
                    "value5": {
                        "value0": 4,
                        "value1": 0.41871805715741286,
                        "value2": {
                            "type": 0,
                            "data": 0
                        }
                    },
                    "value6": {
                        "value0": 0.0,
                        "value1": 0.0,
                        "value2": {
                            "value0": 0,
                            "value1": 0.0,
                            "value2": 0.0
                        }
                    }
                },	
	        // some more here ...
            ],
            "value1": 2,
            "value2": 1
        }
	],
	"value2": 6,
    "value3": [],
    "value4": NaN,
    "value5": [
        3,
        0,
        0,
        0,
        0,
        0
    ],
    "value6": [
        {
            "value0": 3.0,
            "value1": NaN
        },
        {
            "value0": 0.0,
            "value1": 1.0
        },
        {
            "value0": 0.0,
            "value1": 1.0
        },
        {
            "value0": 0.0,
            "value1": 1.0
        },
        {
            "value0": 0.0,
            "value1": 1.0
        },
        {
            "value0": -Infinity,
            "value1": Infinity
        }
    ]
}

What are these value0 ... value6 fields supposed to mean? I'm totally confused.

I tried to examine the C++ code from this repository, but InputArchive, JSONInputArchive, and the other template structures are too complicated for me to follow at the moment.

As far as I understand, I need to get std::vector<node_type> the_nodes; from the k_ary_random_tree somehow, and then get rfr::util::weighted_running_statistics<num_t> response_stat; from each node (k_ary_node, I suppose).

Can you please help me with this issue?
It can become a contribution to the code base, if I could understand how things work here.
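
One way to explore the serialized forest without touching the C++ archive classes is to parse the string and list every numeric field together with its path. The sketch below is only an exploration aid. It assumes the output of ascii_string_representation() is JSON that Python's json module can read (json.loads accepts NaN, Infinity and -Infinity by default), and it makes no claim about what value0 ... value6 mean. The helper name iter_numeric_leaves and the shortened sample string are made up for illustration.

```python
import json
import math


def iter_numeric_leaves(node, path=()):
    """Yield (path, value) for every int/float found in a parsed JSON tree."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_numeric_leaves(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_numeric_leaves(child, path + (index,))
    elif isinstance(node, (int, float)) and not isinstance(node, bool):
        # bool is a subclass of int in Python, so it is excluded explicitly.
        yield path, node


# Hypothetical, heavily shortened stand-in for the real serialization.
sample = (
    '{"value0": {"value0": 10, "value2": true}, '
    '"value4": NaN, '
    '"value6": [{"value0": 3.0, "value1": Infinity}]}'
)
parsed = json.loads(sample)
leaves = list(iter_numeric_leaves(parsed))

# Keep only finite numbers, e.g. as candidates when searching for a maximum.
finite = [(p, v) for p, v in leaves if math.isfinite(v)]
for p, v in finite:
    print("/".join(map(str, p)), v)
```

Listing paths this way at least shows which fields vary between nodes; mapping them back to members such as response_stat still requires reading the serialization code.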

Install fails in both python2 and python3

Greetings!

I'm trying to install pyrfr but I'm getting a gcc related error. My gcc version is:

gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)

When I run pip3 install --user pyrfr, I get the following output:

Collecting pyrfr
Using cached pyrfr-0.7.3.tar.gz
Installing collected packages: pyrfr
Running setup.py install for pyrfr ... error
Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-6jy9xu7f/pyrfr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-txdrnv1a-record/install-record.txt --single-version-externally-managed --compile --user --prefix=:
running install
running build_ext
building 'pyrfr._regression' extension
swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
swig -python -c++ -modern -features nondynamic -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
creating build
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/pyrfr
gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I./include -I/usr/include/python3.6m -c pyrfr/regression_wrap.cpp -o build/temp.linux-x86_64-3.6/pyrfr/regression_wrap.o -O2 -std=c++11
gcc: error trying to exec 'cc1plus': execvp: No such file or directory
error: command 'gcc' failed with exit status 1

----------------------------------------

Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-6jy9xu7f/pyrfr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-txdrnv1a-record/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /tmp/pip-build-6jy9xu7f/pyrfr/

I get a similar error with python2.
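
The message "error trying to exec 'cc1plus'" generally means that the gcc driver is present but the C++ compiler proper is not installed (on Red Hat style systems it typically comes with the gcc-c++ package). A quick pre-install check can report which build tools are not on PATH before pip starts compiling. The helper name missing_tools is hypothetical and the tool list is only a guess based on the log above.

```python
import shutil


def missing_tools(names, which=shutil.which):
    """Return the tools from `names` that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


# gcc and swig appear in the build log; g++ stands in for the C++ front end.
print(missing_tools(["gcc", "g++", "swig"]))
```

An empty list means all three executables were found; it does not guarantee that their versions are suitable.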

docstrings for python interface without doxygen and cmake

At the moment, the python docstrings are generated via CMake and Doxygen.
The repo contains an empty docstrings.i for swig to compile the module, but without any docstrings.
Maybe, a static "docstrings.i" could solve that as well, but there needs to be a mechanism to keep the file automatically up-to-date.

Predict Mean and Variance

pyrfr.regression.predict_mean_var returns a rather useless <class 'SwigPyObject'> object. I would expect that this method returns a tuple or list containing mean and variance.

from pyrfr import regression

# Build data
data = regression.data_container()
data.import_csv_files("../test_data_sets/diabetes_features.csv", "../test_data_sets/diabetes_responses.csv")

# Build tree
rf = regression.binary_rss_forest()
rf.options.num_data_points_per_tree = data.num_data_points()
rng = regression.default_random_engine(1)
rf.fit(data, rng)

# Predict
pred = rf.predict_mean_var(data.retrieve_data_point(0))

Failure to Build, SWIG2.0, Ubuntu 16.04, GCC 5.4

cathal@thinkum:~$ pip3 install --user pyrfr
Collecting pyrfr
  Using cached pyrfr-0.4.0.tar.gz
Building wheels for collected packages: pyrfr
  Running setup.py bdist_wheel for pyrfr ... error
  Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-1s2wffy7/pyrfr/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" bdist_wheel -d /tmp/tmp872ze6ycpip-wheel- --python-tag cp35:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.5
  creating build/lib.linux-x86_64-3.5/pyrfr
  copying pyrfr/__init__.py -> build/lib.linux-x86_64-3.5/pyrfr
  running build_ext
  building '_regression' extension
  swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
  swig -python -c++ -I${CMAKE_SOURCE_DIR}/include -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
  ./include/rfr/trees/binary_fanova_tree.hpp:329: Error: Syntax error in input(3).
  error: command 'swig' failed with exit status 1
  
  ----------------------------------------
  Failed building wheel for pyrfr
  Running setup.py clean for pyrfr
Failed to build pyrfr
Installing collected packages: pyrfr
  Running setup.py install for pyrfr ... error
    Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-1s2wffy7/pyrfr/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-ym37ai90-record/install-record.txt --single-version-externally-managed --compile --user --prefix=:
    running install
    running build_ext
    building '_regression' extension
    swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
    swig -python -c++ -I${CMAKE_SOURCE_DIR}/include -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
    ./include/rfr/trees/binary_fanova_tree.hpp:329: Error: Syntax error in input(3).
    error: command 'swig' failed with exit status 1
    
    ----------------------------------------
Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-1s2wffy7/pyrfr/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-ym37ai90-record/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /tmp/pip-build-1s2wffy7/pyrfr/

Installation via pip

Installing the current version from pip fails as the following command does not work on all machines ;-)

swig -python -c++ -modern -features nondynamic -I/home/sfalkner/repositories/github/random_forest_run/include -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i

I assume the issue is due to line 24 in setup.py of the version on pypi

stdc++ header files installed in /usr/local/include ?

Hello,

I recently updated SMAC3 and random_forest_run. I cloned it from GitHub and the installation itself went fine. Only that I noticed that some headers were installed in my /usr/local/include and suddenly my other projects started to show compilation errors. Here are some examples:

/usr/local/include/inttypes.h:58:4: error: ‘intmax_t’ does not name a type
intmax_t quot;
^
/usr/local/include/inttypes.h:59:4: error: ‘intmax_t’ does not name a type
intmax_t rem;
^
/usr/local/include/inttypes.h:288:1: error: ‘_inline’ does not name a type
_inline

I solved this issue by renaming the /usr/local/include to /usr/local/include.old to see if that was the culprit and for now everything works again. I tested installing SMAC3 and random_forest_run in another machine and the same problem happened.
Is something being installed in this include folder by this package? And if it is, is it supposed to be installed like that?

Best,
Musa

Problem installing pyrfr

This is the error we discussed earlier. It occurs both when I install from the repo and when I install from pip. I am using Ubuntu 16.04 and Python 3.5.2.

(pimp) jan@jan-X202E:~/projects/pythonvirtual/pimp$ pip install git+https://github.com/automl/random_forest_run.git
Collecting git+https://github.com/automl/random_forest_run.git
  Cloning https://github.com/automl/random_forest_run.git to /tmp/pip-qug5_4di-build
Installing collected packages: pyrfr
  Running setup.py install for pyrfr ... error
    Complete output from command /home/jan/projects/pythonvirtual/pimp/bin/python3.5 -u -c "import setuptools, tokenize;__file__='/tmp/pip-qug5_4di-build/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-_7q08_80-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/jan/projects/pythonvirtual/pimp/include/site/python3.5/pyrfr:
    /home/jan/projects/pythonvirtual/pimp/lib/python3.5/site-packages/setuptools/dist.py:345: UserWarning: The version specified ('${RFR_VERSION_MAJOR}.${RFR_VERSION_MINOR}') is an invalid version, this may not work as expected with newer versions of setuptools, pip, and PyPI. Please see PEP 440 for more details.
      "details." % self.metadata.version
    running install
    running build_ext
    building '_regression' extension
    swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
    swig -python -c++ -modern -features nondynamic -I${CMAKE_SOURCE_DIR}/include -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
    creating build
    creating build/temp.linux-x86_64-3.5
    creating build/temp.linux-x86_64-3.5/pyrfr
    x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I${CMAKE_SOURCE_DIR}/include -I./include -I/usr/include/python3.5m -I/home/jan/projects/pythonvirtual/pimp/include/python3.5m -c pyrfr/regression_wrap.cpp -o build/temp.linux-x86_64-3.5/pyrfr/regression_wrap.o -O2 -std=c++11
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    pyrfr/regression_wrap.cpp:171:21: fatal error: Python.h: No such file or directory
    compilation terminated.
    error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
    
    ----------------------------------------
Command "/home/jan/projects/pythonvirtual/pimp/bin/python3.5 -u -c "import setuptools, tokenize;__file__='/tmp/pip-qug5_4di-build/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-_7q08_80-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/jan/projects/pythonvirtual/pimp/include/site/python3.5/pyrfr" failed with error code 1 in /tmp/pip-qug5_4di-build/

Is there any Unix package I should upgrade?
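
A missing Python.h usually means that the CPython development headers are not installed for the interpreter running the build (on Ubuntu they ship in a separate package such as python3-dev). The sketch below uses only the standard library to print where the current interpreter expects the header and whether it is there. The helper name python_header_path is made up for illustration.

```python
import os
import sysconfig


def python_header_path():
    """Return the location where this interpreter expects Python.h."""
    return os.path.join(sysconfig.get_paths()["include"], "Python.h")


header = python_header_path()
print(header, "found" if os.path.isfile(header) else "missing")
```

Running it with the same interpreter that pip uses (here the one inside the virtualenv) matters, because each interpreter reports its own include directory.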

SWIG wheel building failed due to syntax error with SWIG 2.0

I'm in the process of getting a clean setup of auto-sklearn running on CentOS, and I'm getting an error building the wheel. I know there's quite a few issues about this already, but I've found the cause for at least one of the errors, which is the ./include/rfr/trees/binary_fanova_tree.hpp:329: Error: Syntax error in input(3). It has been encountered by other users before, see, e.g., #21, automl/auto-sklearn#314, #18.

It appears to happen in older versions of SWIG, which lack full support for C++11 and later standards. Either way, the error is as follows (the solution can be found below):

running install
running build_ext
building '_regression' extension
swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
swig -python -c++ -I${CMAKE_SOURCE_DIR}/include -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
./include/rfr/trees/binary_fanova_tree.hpp:329: Error: Syntax error in input(3).
error: command 'swig' failed with exit status 1

As the message suggests, this is a problem with line 329 of https://github.com/automl/random_forest_run/blob/v0.6.0/include/rfr/trees/binary_fanova_tree.hpp#L329. Apparently, SWIG has some trouble handling the >>&: changing it to > >& solves the problem.

For users encountering this problem:
Use SWIG >= 3.0. That will also solve the syntax error.
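
To check this before building, the installed SWIG version can be read from the output of swig -version, which contains a line of the form "SWIG Version 3.0.12". The following sketch parses that line. The function names and the MIN_SWIG constant are illustrative, and the threshold simply mirrors the advice above.

```python
import re
import subprocess

MIN_SWIG = (3, 0)  # mirrors the "SWIG >= 3.0" advice above


def parse_swig_version(text):
    """Extract (major, minor, patch) from `swig -version` output, or None."""
    match = re.search(r"SWIG Version (\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def swig_is_recent_enough(text, minimum=MIN_SWIG):
    """True if the version found in `text` is at least `minimum`."""
    version = parse_swig_version(text)
    return version is not None and version[:2] >= minimum


if __name__ == "__main__":
    try:
        out = subprocess.run(["swig", "-version"], capture_output=True,
                             text=True, check=False).stdout
    except FileNotFoundError:
        out = ""  # swig is not installed or not on PATH
    print(parse_swig_version(out), swig_is_recent_enough(out))
```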

Readme.md / RTD disagree

When trying to install pyrfr from source, I noticed that the installation guides in Readme.md and RTD disagree, and it is unclear which one to use.

Also, there is probably a cd .. missing in the installation in Readme.md before python setup.py install --user and one would want to use git clone ... instead of git checkout ...

installation error

Installing the current master branch fails with the following error message:

~$ pip install git+https://github.com/automl/random_forest_run.git
Collecting git+https://github.com/automl/random_forest_run.git
Cloning https://github.com/automl/random_forest_run.git to /tmp/pip-2mqes4u9-build
Installing collected packages: pyrfr
Running setup.py install for pyrfr ... error
Complete output from command /home/kleinaa/virtualenv/debug/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-2mqes4u9-build/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-s574epp1-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/kleinaa/virtualenv/debug/include/site/python3.5/pyrfr:
/home/kleinaa/virtualenv/debug/lib/python3.5/site-packages/setuptools/dist.py:350: UserWarning: The version specified ('${RFR_VERSION_MAJOR}.${RFR_VERSION_MINOR}') is an invalid version, this may not work as expected with newer versions of setuptools, pip, and PyPI. Please see PEP 440 for more details.
"details." % self.metadata.version
running install
running build_ext
building '_regression' extension
swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
swig -python -c++ -modern -features nondynamic -I${CMAKE_SOURCE_DIR}/include -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
creating build
creating build/temp.linux-x86_64-3.5
creating build/temp.linux-x86_64-3.5/pyrfr
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I${CMAKE_SOURCE_DIR}/include -I./include -I/usr/include/python3.5m -I/home/kleinaa/virtualenv/debug/include/python3.5m -c pyrfr/regression_wrap.cpp -o build/temp.linux-x86_64-3.5/pyrfr/regression_wrap.o -O2 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from pyrfr/regression_wrap.cpp:3168:0:
./include/rfr/data_containers/default_data_container_with_instances.hpp: In instantiation of ‘void rfr::data_containers::default_container_with_instances<num_t, response_t, index_t>::check_consistency() [with num_t = double; response_t = double; index_t = unsigned int]’:
pyrfr/regression_wrap.cpp:18885:33: required from here
./include/rfr/data_containers/default_data_container_with_instances.hpp:216:11: warning: unused variable ‘t’ [-Wunused-variable]
index_t t = get_type_of_response();
^
creating build/lib.linux-x86_64-3.5
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.5/pyrfr/regression_wrap.o -o build/lib.linux-x86_64-3.5/_regression.cpython-35m-x86_64-linux-gnu.so
building '_util' extension
swigging pyrfr/util.i to pyrfr/util_wrap.cpp
swig -python -c++ -modern -features nondynamic -I${CMAKE_SOURCE_DIR}/include -I./include -o pyrfr/util_wrap.cpp pyrfr/util.i
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I${CMAKE_SOURCE_DIR}/include -I./include -I/usr/include/python3.5m -I/home/kleinaa/virtualenv/debug/include/python3.5m -c pyrfr/util_wrap.cpp -o build/temp.linux-x86_64-3.5/pyrfr/util_wrap.o -O2 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.5/pyrfr/util_wrap.o -o build/lib.linux-x86_64-3.5/_util.cpython-35m-x86_64-linux-gnu.so
running install_lib
running build_py
creating build/lib.linux-x86_64-3.5/pyrfr
copying pyrfr/util.py -> build/lib.linux-x86_64-3.5/pyrfr
copying pyrfr/__init__.py -> build/lib.linux-x86_64-3.5/pyrfr
copying pyrfr/regression.py -> build/lib.linux-x86_64-3.5/pyrfr
copying pyrfr/docstrings.i -> build/lib.linux-x86_64-3.5/pyrfr
copying build/lib.linux-x86_64-3.5/_util.cpython-35m-x86_64-linux-gnu.so -> /home/kleinaa/virtualenv/debug/lib/python3.5/site-packages
copying build/lib.linux-x86_64-3.5/_regression.cpython-35m-x86_64-linux-gnu.so -> /home/kleinaa/virtualenv/debug/lib/python3.5/site-packages
creating /home/kleinaa/virtualenv/debug/lib/python3.5/site-packages/pyrfr
copying build/lib.linux-x86_64-3.5/pyrfr/util.py -> /home/kleinaa/virtualenv/debug/lib/python3.5/site-packages/pyrfr
copying build/lib.linux-x86_64-3.5/pyrfr/__init__.py -> /home/kleinaa/virtualenv/debug/lib/python3.5/site-packages/pyrfr
copying build/lib.linux-x86_64-3.5/pyrfr/docstrings.i -> /home/kleinaa/virtualenv/debug/lib/python3.5/site-packages/pyrfr
copying build/lib.linux-x86_64-3.5/pyrfr/regression.py -> /home/kleinaa/virtualenv/debug/lib/python3.5/site-packages/pyrfr
byte-compiling /home/kleinaa/virtualenv/debug/lib/python3.5/site-packages/pyrfr/util.py to util.cpython-35.pyc
byte-compiling /home/kleinaa/virtualenv/debug/lib/python3.5/site-packages/pyrfr/__init__.py to __init__.cpython-35.pyc
byte-compiling /home/kleinaa/virtualenv/debug/lib/python3.5/site-packages/pyrfr/regression.py to regression.cpython-35.pyc
running install_data
error: can't copy 'include': doesn't exist or not a regular file

It seems that this is due to the latest commit

FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-0j3kh6cz-build/setup.py'

pip install git+https://github.com/automl/random_forest_run

gives me:

Collecting git+https://github.com/automl/random_forest_run.git (from -r all_requirements.txt (line 12))
  Cloning https://github.com/automl/random_forest_run.git to /tmp/pip-0j3kh6cz-build
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/travis/virtualenv/python3.4.6/lib/python3.4/tokenize.py", line 454, in open
        buffer = _builtin_open(filename, 'rb')
    FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-0j3kh6cz-build/setup.py'

Running marginalized_prediction

I tried to use the marginalized_prediction method of pyrfr in SMAC. However, I get the following error when calling marginalized_prediction:

screen shot 2017-04-26 at 09 57 12

Also, if I try to run the pyrfr_test_partitioning.py file to check if the example works, I get another error:

screen shot 2017-04-26 at 09 58 33

I am using python 3.6 on MAC OS X in an anaconda environment with gcc 4.8.5.
Running pyrfr_example.py works just fine.

more meaningful default values

Forest and tree options should come with meaningful defaults. Maybe reintroducing num_data_points_per_tree = 0 to use as many as the data container has...
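
The suggested convention can be sketched as a small resolution step that runs before fitting. This is only an illustration of the idea: the function name resolve_points_per_tree is hypothetical and is not part of pyrfr.

```python
def resolve_points_per_tree(requested, available):
    """Treat 0 as 'use every data point in the container'."""
    if requested < 0:
        raise ValueError("num_data_points_per_tree must be non-negative")
    return available if requested == 0 else requested


# With the default of 0, a container holding 442 points yields 442 per tree.
print(resolve_points_per_tree(0, 442))
print(resolve_points_per_tree(100, 442))
```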

fix fANOVA tree

To speed up the fANOVA, the marginal_predict vectors of its trees have to be weighted_running_statistics so that the variance is computed properly. This will also require a 'scale_weights_by_factor' for the stats class to efficiently combine these for different subspace sizes.
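
To make the idea concrete, here is a minimal Python sketch of a weighted running-statistics accumulator with a weight-scaling operation. It is not the library's C++ class: the class name, the merge method and the attribute names are assumptions chosen for illustration, and the update rule is the standard incremental weighted mean/variance formula.

```python
class WeightedRunningStats:
    """Weighted mean/variance accumulator with incremental updates."""

    def __init__(self):
        self.sum_w = 0.0  # total weight seen so far
        self.mean = 0.0   # weighted mean
        self.sdm = 0.0    # weighted sum of squared deviations from the mean

    def push(self, x, w=1.0):
        """Add one observation x with positive weight w."""
        if w <= 0:
            raise ValueError("weight must be positive")
        new_sum = self.sum_w + w
        delta = x - self.mean
        self.mean += delta * w / new_sum
        self.sdm += w * delta * (x - self.mean)
        self.sum_w = new_sum

    def scale_weights_by_factor(self, factor):
        """Multiply every weight by factor; mean and variance stay the same."""
        if factor <= 0:
            raise ValueError("factor must be positive")
        self.sum_w *= factor
        self.sdm *= factor

    def merge(self, other):
        """Fold another accumulator into this one."""
        if other.sum_w == 0:
            return
        total = self.sum_w + other.sum_w
        delta = other.mean - self.mean
        self.sdm += other.sdm + delta * delta * self.sum_w * other.sum_w / total
        self.mean += delta * other.sum_w / total
        self.sum_w = total

    def variance_population(self):
        """Weighted population variance of everything pushed so far."""
        return self.sdm / self.sum_w


# Two partial statistics; the second one counts double after scaling.
a = WeightedRunningStats()
a.push(1.0)
a.push(3.0)
b = WeightedRunningStats()
b.push(5.0)
b.scale_weights_by_factor(2.0)
a.merge(b)
print(a.mean, a.variance_population())
```

Scaling touches only two numbers per accumulator, which is why rescaling statistics for differently sized subspaces is cheap compared with re-adding every sample.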

Minimal version of swig?

Hi,
on the nemo cluster, swig is installed in version 2.0.10 and the installation fails:

    swig -python -c++ -I${CMAKE_SOURCE_DIR}/include -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
    ./include/rfr/trees/binary_fanova_tree.hpp:329: Error: Syntax error in input(3).

On our machines, swig is running in version 3.0.
So, I wanted to ask whether there is a minimal required version of SWIG for pyrfr?

Cheers,
Marius

classifier in python

Greetings!

I noticed that, right now, only the regressor forest is available. Are there any plans to release the classifier python bindings?

Bug: Variance across trees

Hi,

I may have found a bug in the variance prediction.
The following mini example illustrates the issue.
In the example, we have only 2 data points and I train 100 trees until the end (no pruning) without bootstrapping. So, the variance prediction within each tree is zero.

Because of the different split points in each tree, the prediction on x=0.5 is different in each tree. So,
the variance across trees should be larger than 0.
Unfortunately, the function predict_mean_var() nevertheless returns 0 as variance.

@sfalkner Could you please look into this issue.

Thanks,
Marius

import numpy as np
from pyrfr import regression

rng = regression.default_random_engine(42)
rf_opts = regression.forest_opts()
rf_opts.num_trees = 100
rf_opts.do_bootstrapping = False
rf_opts.tree_opts.max_features = 1
rf_opts.tree_opts.min_samples_to_split = 1
rf_opts.tree_opts.min_samples_in_leaf = 1
rf_opts.tree_opts.max_depth = 100000
rf_opts.tree_opts.epsilon_purity = 0
rf_opts.tree_opts.max_num_nodes = 100000
rf_opts.num_data_points_per_tree = 2

rf = regression.binary_rss_forest()
rf.options = rf_opts

data = regression.default_data_container(1)

data.set_bounds_of_feature(0,0,1)

data.add_data_point(np.array([0.]), 0.)
data.add_data_point(np.array([1.]), 1.)

rf.fit(data, rng=rng)

m,v = rf.predict_mean_var(np.array([0.5]))

print(m,v)

preds_per_trees = rf.all_leaf_values(np.array([0.5]))
m = np.mean(preds_per_trees, axis=0)
v = np.var(preds_per_trees, axis=0)
print(m,v)
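
The expectation described in this issue can be written down with the law of total variance: the forest variance is the average variance inside the leaves plus the variance of the per-tree means. The sketch below is plain Python and independent of pyrfr; it assumes equally weighted trees and takes a list of per-tree leaf samples as input, and the function name forest_mean_var is hypothetical.

```python
from statistics import fmean, pvariance


def forest_mean_var(leaf_values_per_tree):
    """Combine per-tree leaf samples via the law of total variance."""
    tree_means = [fmean(values) for values in leaf_values_per_tree]
    tree_vars = [pvariance(values) for values in leaf_values_per_tree]
    mean = fmean(tree_means)
    within = fmean(tree_vars)        # average variance inside the leaves
    between = pvariance(tree_means)  # spread of the tree predictions
    return mean, within + between


# Hypothetical forest: each tree's leaf for x=0.5 holds a single point,
# so every within-tree variance is 0, but the trees disagree.
leaves = [[0.0], [1.0], [1.0], [0.0]]
mean, var = forest_mean_var(leaves)
print(mean, var)
```

In this toy case the within-tree term vanishes and the whole variance comes from the disagreement between trees, which is the behaviour the issue expects from predict_mean_var.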

Please provide an *official* license

👋 As a dependency of auto-sklearn I'm trying to help package pyrfr with conda:
conda-forge/staged-recipes#13919

Not having a proper license can be problematic if you want others to actually be able to use your work as implied below:

license='Use as you wish. No guarantees whatsoever.',

IANAL, but I believe that if you don't have a license, laws in certain countries confer copyright or other rights on the authors whether or not you want them.

Whilst the current license implies that others are free to use the software, it's not an official, recognised license so it's unclear whether or not it would hold up in court. Because of this lack of legal clarity the software would not be allowed to be used in many enterprises.

If you do want anyone and everyone to be able to use your software I'd suggest using a license such as MIT which is very commonly used in the PyData stack.

Unable to install pyrfr on Mac OSX 10.15.7

Hi all,

I am trying to install pyrfr using the command pip3 install pyrfr, but I am getting an unsupported architecture error (given below). Is there any way to circumvent this issue and install the Python package?

  Building wheel for pyrfr (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /Library/Developer/CommandLineTools/usr/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/99/l96sxhh57px2fhhl6cc0b2_80000gn/T/pip-install-q7iwj9uz/pyrfr/setup.py'"'"'; __file__='"'"'/private/var/folders/99/l96sxhh57px2fhhl6cc0b2_80000gn/T/pip-install-q7iwj9uz/pyrfr/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /private/var/folders/99/l96sxhh57px2fhhl6cc0b2_80000gn/T/pip-wheel-h0hfwx4z --python-tag cp38
       cwd: /private/var/folders/99/l96sxhh57px2fhhl6cc0b2_80000gn/T/pip-install-q7iwj9uz/pyrfr/
  Complete output (116 lines):
  running bdist_wheel
  running build
  running build_ext
  building 'pyrfr._regression' extension
  swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
  swig -python -c++ -modern -features nondynamic -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
  Deprecated command line option: -modern. This option is now always on.
  creating build
  creating build/temp.macosx-10.14.6-x86_64-3.8
  creating build/temp.macosx-10.14.6-x86_64-3.8/pyrfr
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -iwithsysroot/System/Library/Frameworks/System.framework/PrivateHeaders -iwithsysroot/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.8/Headers -arch arm64 -arch x86_64 -I./include -I/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8 -c pyrfr/regression_wrap.cpp -o build/temp.macosx-10.14.6-x86_64-3.8/pyrfr/regression_wrap.o -O2 -std=c++11
  In file included from pyrfr/regression_wrap.cpp:178:
  In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:11:
  In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/limits.h:57:
  In file included from /Library/Developer/CommandLineTools/usr/lib/clang/12.0.0/include/limits.h:21:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/limits.h:63:
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/cdefs.h:807:2: error: Unsupported architecture
  #error Unsupported architecture
   ^
  In file included from pyrfr/regression_wrap.cpp:178:
  In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:11:
  In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/limits.h:57:
  In file included from /Library/Developer/CommandLineTools/usr/lib/clang/12.0.0/include/limits.h:21:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/limits.h:64:
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/machine/limits.h:8:2: error: architecture not supported
  #error architecture not supported
   ^
  In file included from pyrfr/regression_wrap.cpp:178:
  In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:25:
  In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/stdio.h:107:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stdio.h:64:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_stdio.h:71:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_types.h:27:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:33:
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/machine/_types.h:34:2: error: architecture not supported
  #error architecture not supported
   ^
  In file included from pyrfr/regression_wrap.cpp:178:
  In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:25:
  In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/stdio.h:107:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stdio.h:64:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_stdio.h:71:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_types.h:27:
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:55:9: error: unknown type name '__int64_t'
  typedef __int64_t       __darwin_blkcnt_t;      /* total blocks */
          ^
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:56:9: error: unknown type name '__int32_t'; did you mean '__int128_t'?
  typedef __int32_t       __darwin_blksize_t;     /* preferred block size */
          ^
  note: '__int128_t' declared here
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:57:9: error: unknown type name '__int32_t'; did you mean '__int128_t'?
  typedef __int32_t       __darwin_dev_t;         /* dev_t */
          ^
  note: '__int128_t' declared here
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:60:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
  typedef __uint32_t      __darwin_gid_t;         /* [???] process and group IDs */
          ^
  note: '__uint128_t' declared here
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:61:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
  typedef __uint32_t      __darwin_id_t;          /* [XSI] pid_t, uid_t, or gid_t*/
          ^
  note: '__uint128_t' declared here
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:62:9: error: unknown type name '__uint64_t'
  typedef __uint64_t      __darwin_ino64_t;       /* [???] Used for 64 bit inodes */
          ^
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:68:9: error: unknown type name '__darwin_natural_t'
  typedef __darwin_natural_t __darwin_mach_port_name_t; /* Used by mach */
          ^
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:70:9: error: unknown type name '__uint16_t'; did you mean '__uint128_t'?
  typedef __uint16_t      __darwin_mode_t;        /* [???] Some file attributes */
          ^
  note: '__uint128_t' declared here
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:71:9: error: unknown type name '__int64_t'
  typedef __int64_t       __darwin_off_t;         /* [???] Used for file sizes */
          ^
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:72:9: error: unknown type name '__int32_t'; did you mean '__int128_t'?
  typedef __int32_t       __darwin_pid_t;         /* [???] process and group IDs */
          ^
  note: '__int128_t' declared here
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:73:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
  typedef __uint32_t      __darwin_sigset_t;      /* [???] signal set */
          ^
  note: '__uint128_t' declared here
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:74:9: error: unknown type name '__int32_t'; did you mean '__int128_t'?
  typedef __int32_t       __darwin_suseconds_t;   /* [???] microseconds */
          ^
  note: '__int128_t' declared here
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:75:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
  typedef __uint32_t      __darwin_uid_t;         /* [???] user IDs */
          ^
  note: '__uint128_t' declared here
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:76:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
  typedef __uint32_t      __darwin_useconds_t;    /* [???] microseconds */
          ^
  note: '__uint128_t' declared here
  In file included from pyrfr/regression_wrap.cpp:178:
  In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:25:
  In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/stdio.h:107:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stdio.h:64:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_stdio.h:71:
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_types.h:43:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
  typedef __uint32_t      __darwin_wctype_t;
          ^
  note: '__uint128_t' declared here
  In file included from pyrfr/regression_wrap.cpp:178:
  In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:25:
  In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/stdio.h:107:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stdio.h:64:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_stdio.h:75:
  In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types/_va_list.h:31:
  /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/machine/types.h:37:2: error: architecture not supported
  #error architecture not supported
   ^
  fatal error: too many errors emitted, stopping now [-ferror-limit=]
  20 errors generated.
  error: command 'clang' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for pyrfr
  Running setup.py clean for pyrfr
Failed to build pyrfr
Installing collected packages: pyrfr
  Running setup.py install for pyrfr ... error
    ERROR: Command errored out with exit status 1:
     command: /Library/Developer/CommandLineTools/usr/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/99/l96sxhh57px2fhhl6cc0b2_80000gn/T/pip-install-q7iwj9uz/pyrfr/setup.py'"'"'; __file__='"'"'/private/var/folders/99/l96sxhh57px2fhhl6cc0b2_80000gn/T/pip-install-q7iwj9uz/pyrfr/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/99/l96sxhh57px2fhhl6cc0b2_80000gn/T/pip-record-zz57ei0_/install-record.txt --single-version-externally-managed --compile
         cwd: /private/var/folders/99/l96sxhh57px2fhhl6cc0b2_80000gn/T/pip-install-q7iwj9uz/pyrfr/
    Complete output (115 lines):
    running install
    running build_ext
    building 'pyrfr._regression' extension
    swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
    swig -python -c++ -modern -features nondynamic -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
    Deprecated command line option: -modern. This option is now always on.
    creating build
    creating build/temp.macosx-10.14.6-x86_64-3.8
    creating build/temp.macosx-10.14.6-x86_64-3.8/pyrfr
    clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -iwithsysroot/System/Library/Frameworks/System.framework/PrivateHeaders -iwithsysroot/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.8/Headers -arch arm64 -arch x86_64 -I./include -I/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8 -c pyrfr/regression_wrap.cpp -o build/temp.macosx-10.14.6-x86_64-3.8/pyrfr/regression_wrap.o -O2 -std=c++11
    In file included from pyrfr/regression_wrap.cpp:178:
    In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:11:
    In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/limits.h:57:
    In file included from /Library/Developer/CommandLineTools/usr/lib/clang/12.0.0/include/limits.h:21:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/limits.h:63:
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/cdefs.h:807:2: error: Unsupported architecture
    #error Unsupported architecture
     ^
    In file included from pyrfr/regression_wrap.cpp:178:
    In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:11:
    In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/limits.h:57:
    In file included from /Library/Developer/CommandLineTools/usr/lib/clang/12.0.0/include/limits.h:21:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/limits.h:64:
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/machine/limits.h:8:2: error: architecture not supported
    #error architecture not supported
     ^
    In file included from pyrfr/regression_wrap.cpp:178:
    In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:25:
    In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/stdio.h:107:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stdio.h:64:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_stdio.h:71:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_types.h:27:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:33:
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/machine/_types.h:34:2: error: architecture not supported
    #error architecture not supported
     ^
    In file included from pyrfr/regression_wrap.cpp:178:
    In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:25:
    In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/stdio.h:107:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stdio.h:64:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_stdio.h:71:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_types.h:27:
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:55:9: error: unknown type name '__int64_t'
    typedef __int64_t       __darwin_blkcnt_t;      /* total blocks */
            ^
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:56:9: error: unknown type name '__int32_t'; did you mean '__int128_t'?
    typedef __int32_t       __darwin_blksize_t;     /* preferred block size */
            ^
    note: '__int128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:57:9: error: unknown type name '__int32_t'; did you mean '__int128_t'?
    typedef __int32_t       __darwin_dev_t;         /* dev_t */
            ^
    note: '__int128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:60:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
    typedef __uint32_t      __darwin_gid_t;         /* [???] process and group IDs */
            ^
    note: '__uint128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:61:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
    typedef __uint32_t      __darwin_id_t;          /* [XSI] pid_t, uid_t, or gid_t*/
            ^
    note: '__uint128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:62:9: error: unknown type name '__uint64_t'
    typedef __uint64_t      __darwin_ino64_t;       /* [???] Used for 64 bit inodes */
            ^
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:68:9: error: unknown type name '__darwin_natural_t'
    typedef __darwin_natural_t __darwin_mach_port_name_t; /* Used by mach */
            ^
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:70:9: error: unknown type name '__uint16_t'; did you mean '__uint128_t'?
    typedef __uint16_t      __darwin_mode_t;        /* [???] Some file attributes */
            ^
    note: '__uint128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:71:9: error: unknown type name '__int64_t'
    typedef __int64_t       __darwin_off_t;         /* [???] Used for file sizes */
            ^
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:72:9: error: unknown type name '__int32_t'; did you mean '__int128_t'?
    typedef __int32_t       __darwin_pid_t;         /* [???] process and group IDs */
            ^
    note: '__int128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:73:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
    typedef __uint32_t      __darwin_sigset_t;      /* [???] signal set */
            ^
    note: '__uint128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:74:9: error: unknown type name '__int32_t'; did you mean '__int128_t'?
    typedef __int32_t       __darwin_suseconds_t;   /* [???] microseconds */
            ^
    note: '__int128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:75:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
    typedef __uint32_t      __darwin_uid_t;         /* [???] user IDs */
            ^
    note: '__uint128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:76:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
    typedef __uint32_t      __darwin_useconds_t;    /* [???] microseconds */
            ^
    note: '__uint128_t' declared here
    In file included from pyrfr/regression_wrap.cpp:178:
    In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:25:
    In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/stdio.h:107:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stdio.h:64:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_stdio.h:71:
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_types.h:43:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
    typedef __uint32_t      __darwin_wctype_t;
            ^
    note: '__uint128_t' declared here
    In file included from pyrfr/regression_wrap.cpp:178:
    In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:25:
    In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/stdio.h:107:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stdio.h:64:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_stdio.h:75:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types/_va_list.h:31:
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/machine/types.h:37:2: error: architecture not supported
    #error architecture not supported
     ^
    fatal error: too many errors emitted, stopping now [-ferror-limit=]
    20 errors generated.
    error: command 'clang' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /Library/Developer/CommandLineTools/usr/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/99/l96sxhh57px2fhhl6cc0b2_80000gn/T/pip-install-q7iwj9uz/pyrfr/setup.py'"'"'; __file__='"'"'/private/var/folders/99/l96sxhh57px2fhhl6cc0b2_80000gn/T/pip-install-q7iwj9uz/pyrfr/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/99/l96sxhh57px2fhhl6cc0b2_80000gn/T/pip-record-zz57ei0_/install-record.txt --single-version-externally-managed --compile Check the logs for full command output.

unit_test_fanova fails on macOS

5: Test command: /Users/wsk/random_forest_run/build/tests/ut_fanova "/Users/wsk/random_forest_run/test_data_sets/"
5: Test timeout computed to be: 9.99988e+06
5: Boost.Test WARNING: token "/Users/wsk/random_forest_run/test_data_sets/" does not correspond to the Boost.Test argument
5: and should be placed after all Boost.Test arguments and the -- separator.
5: For example: ut_fanova --random -- /Users/wsk/random_forest_run/test_data_sets/
5: Running 1 test case...
5: /Users/wsk/random_forest_run/tests/unit_test_fanova.cpp:269: fatal error: in "legacy_fanova_test": difference{0.00697191} between the_tree.get_subspace_size(1){177.37526839097126} and s1{178.61191331472548} exceeds 1e-06%
5:
5: *** 1 failure is detected in the test module "ut_fanova"
5/8 Test #5: ut_fanova ........................***Failed 0.01 sec
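
The reported difference{0.00697191} can be reproduced from the two values printed in the log. A small sketch, assuming Boost.Test reports the absolute difference divided by the smaller of the two magnitudes (that assumption matches the printed number; the variable names are only illustrative):

```python
# Values copied from the failing assertion in the log above.
subspace_size = 177.37526839097126  # the_tree.get_subspace_size(1)
s1 = 178.61191331472548             # expected value s1

abs_diff = abs(subspace_size - s1)
# Assumption: the difference is reported relative to the smaller magnitude.
rel_diff = abs_diff / min(abs(subspace_size), abs(s1))

print(f"absolute difference: {abs_diff:.6f}")
print(f"relative difference: {rel_diff:.8f} ({rel_diff * 100:.4f} %)")
```

A relative deviation of roughly 0.7 % is several orders of magnitude above the tolerance in the message, so this looks more like a genuinely different result on this platform than like floating-point rounding noise.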

(New) Installation error

On Ubuntu 16.04, swig and Cython are installed. Is there anything else that I need?

Collecting pyrfr
  Using cached https://files.pythonhosted.org/packages/21/4c/58533c51ab301f61d3521dc4cd29ba8145eed8f11b84f70aba9fd28f6aca/pyrfr-0.4.0.tar.gz
Building wheels for collected packages: pyrfr
  Running setup.py bdist_wheel for pyrfr ... error
  Complete output from command /home/janvanrijn/anaconda3/envs/openml-multitask/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-8xQhnL/pyrfr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmp33qToPpip-wheel- --python-tag cp27:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-2.7
  creating build/lib.linux-x86_64-2.7/pyrfr
  copying pyrfr/__init__.py -> build/lib.linux-x86_64-2.7/pyrfr
  running build_ext
  building '_regression' extension
  swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
  swig -python -c++ -I${CMAKE_SOURCE_DIR}/include -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
  creating build/temp.linux-x86_64-2.7
  creating build/temp.linux-x86_64-2.7/pyrfr
  gcc -pthread -B /home/janvanrijn/anaconda3/envs/openml-multitask/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I${CMAKE_SOURCE_DIR}/include -I./include -I/home/janvanrijn/anaconda3/envs/openml-multitask/include/python2.7 -c pyrfr/regression_wrap.cpp -o build/temp.linux-x86_64-2.7/pyrfr/regression_wrap.o -O2 -std=c++11
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  g++ -pthread -shared -B /home/janvanrijn/anaconda3/envs/openml-multitask/compiler_compat -L/home/janvanrijn/anaconda3/envs/openml-multitask/lib -Wl,-rpath=/home/janvanrijn/anaconda3/envs/openml-multitask/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-2.7/pyrfr/regression_wrap.o -L/home/janvanrijn/anaconda3/envs/openml-multitask/lib -lpython2.7 -o build/lib.linux-x86_64-2.7/_regression.so
  building '_util' extension
  swigging pyrfr/util.i to pyrfr/util_wrap.cpp
  swig -python -c++ -I${CMAKE_SOURCE_DIR}/include -I./include -o pyrfr/util_wrap.cpp pyrfr/util.i
  gcc -pthread -B /home/janvanrijn/anaconda3/envs/openml-multitask/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I${CMAKE_SOURCE_DIR}/include -I./include -I/home/janvanrijn/anaconda3/envs/openml-multitask/include/python2.7 -c pyrfr/util_wrap.cpp -o build/temp.linux-x86_64-2.7/pyrfr/util_wrap.o -O2 -std=c++11
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  g++ -pthread -shared -B /home/janvanrijn/anaconda3/envs/openml-multitask/compiler_compat -L/home/janvanrijn/anaconda3/envs/openml-multitask/lib -Wl,-rpath=/home/janvanrijn/anaconda3/envs/openml-multitask/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-2.7/pyrfr/util_wrap.o -L/home/janvanrijn/anaconda3/envs/openml-multitask/lib -lpython2.7 -o build/lib.linux-x86_64-2.7/_util.so
  installing to build/bdist.linux-x86_64/wheel
  running install
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/tmp/pip-build-8xQhnL/pyrfr/setup.py", line 42, in <module>
      cmdclass={'install': CustomInstall}
    File "/home/janvanrijn/anaconda3/envs/openml-multitask/lib/python2.7/distutils/core.py", line 151, in setup
      dist.run_commands()
    File "/home/janvanrijn/anaconda3/envs/openml-multitask/lib/python2.7/distutils/dist.py", line 953, in run_commands
      self.run_command(cmd)
    File "/home/janvanrijn/anaconda3/envs/openml-multitask/lib/python2.7/distutils/dist.py", line 972, in run_command
      cmd_obj.run()
    File "/home/janvanrijn/anaconda3/envs/openml-multitask/lib/python2.7/site-packages/wheel/bdist_wheel.py", line 238, in run
      self.run_command('install')
    File "/home/janvanrijn/anaconda3/envs/openml-multitask/lib/python2.7/distutils/cmd.py", line 326, in run_command
      self.distribution.run_command(command)
    File "/home/janvanrijn/anaconda3/envs/openml-multitask/lib/python2.7/distutils/dist.py", line 972, in run_command
      cmd_obj.run()
    File "/tmp/pip-build-8xQhnL/pyrfr/setup.py", line 9, in run
      build.run(self)
  TypeError: unbound method run() must be called with build instance as first argument (got CustomInstall instance instead)
  
  ----------------------------------------
  Failed building wheel for pyrfr
  Running setup.py clean for pyrfr
Failed to build pyrfr
Installing collected packages: pyrfr
  Running setup.py install for pyrfr ... error
    Complete output from command /home/janvanrijn/anaconda3/envs/openml-multitask/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-8xQhnL/pyrfr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-y2NuV5-record/install-record.txt --single-version-externally-managed --compile:
    running install
    running build_ext
    building '_regression' extension
    swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
    swig -python -c++ -I${CMAKE_SOURCE_DIR}/include -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
    creating build
    creating build/temp.linux-x86_64-2.7
    creating build/temp.linux-x86_64-2.7/pyrfr
    gcc -pthread -B /home/janvanrijn/anaconda3/envs/openml-multitask/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I${CMAKE_SOURCE_DIR}/include -I./include -I/home/janvanrijn/anaconda3/envs/openml-multitask/include/python2.7 -c pyrfr/regression_wrap.cpp -o build/temp.linux-x86_64-2.7/pyrfr/regression_wrap.o -O2 -std=c++11
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    creating build/lib.linux-x86_64-2.7
    g++ -pthread -shared -B /home/janvanrijn/anaconda3/envs/openml-multitask/compiler_compat -L/home/janvanrijn/anaconda3/envs/openml-multitask/lib -Wl,-rpath=/home/janvanrijn/anaconda3/envs/openml-multitask/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-2.7/pyrfr/regression_wrap.o -L/home/janvanrijn/anaconda3/envs/openml-multitask/lib -lpython2.7 -o build/lib.linux-x86_64-2.7/_regression.so
    building '_util' extension
    swigging pyrfr/util.i to pyrfr/util_wrap.cpp
    swig -python -c++ -I${CMAKE_SOURCE_DIR}/include -I./include -o pyrfr/util_wrap.cpp pyrfr/util.i
    gcc -pthread -B /home/janvanrijn/anaconda3/envs/openml-multitask/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I${CMAKE_SOURCE_DIR}/include -I./include -I/home/janvanrijn/anaconda3/envs/openml-multitask/include/python2.7 -c pyrfr/util_wrap.cpp -o build/temp.linux-x86_64-2.7/pyrfr/util_wrap.o -O2 -std=c++11
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    g++ -pthread -shared -B /home/janvanrijn/anaconda3/envs/openml-multitask/compiler_compat -L/home/janvanrijn/anaconda3/envs/openml-multitask/lib -Wl,-rpath=/home/janvanrijn/anaconda3/envs/openml-multitask/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-2.7/pyrfr/util_wrap.o -L/home/janvanrijn/anaconda3/envs/openml-multitask/lib -lpython2.7 -o build/lib.linux-x86_64-2.7/_util.so
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-8xQhnL/pyrfr/setup.py", line 42, in <module>
        cmdclass={'install': CustomInstall}
      File "/home/janvanrijn/anaconda3/envs/openml-multitask/lib/python2.7/distutils/core.py", line 151, in setup
        dist.run_commands()
      File "/home/janvanrijn/anaconda3/envs/openml-multitask/lib/python2.7/distutils/dist.py", line 953, in run_commands
        self.run_command(cmd)
      File "/home/janvanrijn/anaconda3/envs/openml-multitask/lib/python2.7/distutils/dist.py", line 972, in run_command
        cmd_obj.run()
      File "/tmp/pip-build-8xQhnL/pyrfr/setup.py", line 9, in run
        build.run(self)
    TypeError: unbound method run() must be called with build instance as first argument (got CustomInstall instance instead)
    
    ----------------------------------------
Command "/home/janvanrijn/anaconda3/envs/openml-multitask/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-8xQhnL/pyrfr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-y2NuV5-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-8xQhnL/pyrfr/
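
The traceback ends in setup.py line 9, where build.run(self) receives a CustomInstall instance. The build ran under Python 2.7, which rejects calling an unbound method with an instance of an unrelated class, and the README states that Python 2 is no longer supported. A minimal sketch of the usual alternative follows, using stand-in classes rather than the real distutils/setuptools commands: the custom command requests the build step by name and then calls its own parent's run. Every class here except the name CustomInstall (taken from the traceback) is hypothetical.

```python
class Command:
    """Stand-in for a distutils-style command base class (not the real one)."""

    def __init__(self, log):
        self.log = log

    def run_command(self, name):
        # Look the command up by name and run a proper instance of it,
        # instead of calling another class's run() with the wrong 'self'.
        COMMANDS[name](self.log).run()


class BuildExt(Command):
    def run(self):
        self.log.append("build_ext")


class Install(Command):
    def run(self):
        self.log.append("install")


class CustomInstall(Install):
    def run(self):
        # Build the extension first, then perform the normal install step.
        self.run_command("build_ext")
        Install.run(self)


COMMANDS = {"build_ext": BuildExt, "install": Install}

log = []
CustomInstall(log).run()
print(log)  # ['build_ext', 'install']
```

The point of the pattern is that each run() is only ever invoked on an instance of its own class, which works the same way on Python 2 and Python 3.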

Python 3.11 wheel

PyPI has wheels up to Python 3.10, but the latest Python release is 3.11, so it would be great to have binaries for that, too.

Win installation error

Windows 11
Anaconda, Python 3.7

Building wheel for pyrfr (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [8 lines of output]
      [<setuptools.extension.Extension('pyrfr._regression') at 0x18d195b3488>, <setuptools.extension.Extension('pyrfr._util') at 0x18d1be1a288>]
      running bdist_wheel
      running build
      running build_ext
      building 'pyrfr._regression' extension
      swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
      swig.exe -python -c++ -modern -py3 -features nondynamic -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
      error: command 'swig.exe' failed: None
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for pyrfr
  Running setup.py clean for pyrfr
Successfully built lap smac pynisher
Failed to build pyrfr
Installing collected packages: sortedcontainers, pytz, msgpack, lap, heapdict, geojson, easydict, zipp, zict, xmltodict, Werkzeug, typing-extensions, tornado, toolz, tifffile, threadpoolctl, tblib, shapely, scipy, regex, pyzmq, pyyaml, PyWavelets, python-dateutil, pyrfr, pyparsing, pycryptodome, psutil, opencv-contrib-python, networkx, natsort, munch, MarkupSafe, locket, llvmlite, joblib, itsdangerous, imageio, future, fsspec, fonttools, et-xmlfile, emcee, cython, cycler, colorama, cloudpickle, chardet, Babel, tqdm, scikit-learn, pynisher, partd, pandas, packaging, openpyxl, numba, kiwisolver, Jinja2, importlib-metadata, ConfigSpace, bce-python-sdk, scikit-image, motmetrics, matplotlib, dask, click, pycocotools, flask, distributed, smac, Flask-Babel, visualdl, paddleslim
  Running setup.py install for pyrfr ... error
  error: subprocess-exited-with-error

  × Running setup.py install for pyrfr did not run successfully.
  │ exit code: 1
  ╰─> [9 lines of output]
      [<setuptools.extension.Extension('pyrfr._regression') at 0x1e614c9d508>, <setuptools.extension.Extension('pyrfr._util') at 0x1e6173c3dc8>]
      running install
      D:\anaconda3\envs\PaddleRS\lib\site-packages\setuptools\command\install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
        setuptools.SetuptoolsDeprecationWarning,
      running build_ext
      building 'pyrfr._regression' extension
      swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
      swig.exe -python -c++ -modern -py3 -features nondynamic -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
      error: command 'swig.exe' failed: None
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> pyrfr

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

Automatically set num_features when adding new datapoints

When a data container is initialized, num_features is optional and defaults to 0 if omitted. Adding a data point then fails with a runtime error (RuntimeError: Number of elements does not match.).

from pyrfr import regression
data = regression.data_container()
data.add_data_point([1,2], 1)

Could data_container either adjust num_features automatically or raise a clear error message when the first data sample is added?
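
The requested behaviour can be sketched in pure Python. LazyDataContainer below is a hypothetical stand-in, not part of pyrfr; it only illustrates adopting the feature count from the first sample and reporting a descriptive error for later mismatches.

```python
class LazyDataContainer:
    """Hypothetical pure-Python stand-in; not the pyrfr data_container."""

    def __init__(self, num_features=0):
        self.num_features = num_features
        self.features = []
        self.responses = []

    def add_data_point(self, features, response):
        features = list(features)
        if self.num_features == 0 and not self.features:
            # First sample: adopt its length instead of failing.
            self.num_features = len(features)
        if len(features) != self.num_features:
            raise ValueError(
                f"expected {self.num_features} features, got {len(features)}"
            )
        self.features.append(features)
        self.responses.append(response)


data = LazyDataContainer()
data.add_data_point([1, 2], 1)
print(data.num_features)  # 2
```

Another report on this page constructs the container as reg.data_container(X.shape[1]), which sidesteps the problem by stating the feature count up front.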

num_data_points_per_tree = 1 grows the trees with only one sample for the whole forest

If the option num_data_points_per_tree is set to 1, all trees of the forest are grown using exactly the same sample.

The issue can be reproduced with the following:

import numpy as np
from pyrfr import regression as reg
X = np.array([
    [0., 0., 0.],
    [0., 0., 1.],
    [0., 1., 0.],
    [0., 1., 1.],
    [1., 0., 0.],
    [1., 0., 1.],
    [1., 1., 0.],
    [1., 1., 1.]], dtype=np.float64)
y = np.array([
    [.1],
    [.2],
    [9],
    [9.2],
    [100.],
    [100.2],
    [109.],
    [109.2]], dtype=np.float64)


rng = reg.default_random_engine(12345)

rf_opts = reg.forest_opts()
rf_opts.num_trees = 10
rf_opts.seed = 12345
rf_opts.do_bootstrapping = True
rf_opts.max_features = 3
rf_opts.min_samples_to_split = 3
rf_opts.min_samples_in_leaf = 3
rf_opts.max_depth = 20
rf_opts.epsilon_purity = 1e-8
rf_opts.max_num_nodes = 1000


print('='*120)
print('One data point per tree')
print('='*120)
rf_opts.num_data_points_per_tree = 1  # for n > 1 the forest works

forrest = reg.binary_rss_forest()
forrest.options = rf_opts

data = reg.data_container(X.shape[1])
for row_x, row_y in zip(X, y.flatten()):
    data.add_data_point(row_x, row_y)

forrest.fit(data, rng=rng)

for i in range(X.shape[0]):
    print('Predicting for sample %d: ' % i)
    print('Sample %d: %s' % (i, str(X[i])))
    print('Leaf-Values: %s' % str(forrest.all_leaf_values(data.retrieve_data_point(i))))
    print('ŷ: ' + str(forrest.predict(data.retrieve_data_point(i))))
    print('y: ' + str(data.response(i)))
    print('#'*120)

Building pyrfr on circle-ci fails

In order to build the SMAC documentation online on circle-ci, I need to install pyrfr. For reasons I don't understand, this fails. Probably a dependency is missing or has the wrong version, but I don't know how to find out which.

Here are some links which might be helpful:

Failing build

circle.yml

forest options are not used in fANOVA forest

import pyrfr.regression as reg
data = reg.data_container()
data.import_csv_files('../test_data_sets/toy_data_set_features.csv', '../test_data_sets/toy_data_set_features.csv')
opts = reg.forest_opts()
opts.num_trees = 12
f2 = reg.fanova_forest(opts)
rng = reg.default_random_engine(12345)  # rng was not defined in the original snippet
f2.fit(data, rng)

fails with

RuntimeError                              Traceback (most recent call last)
<ipython-input-13-6c1e73434891> in <module>()
----> 1 f2.fit(data,rng)

/ihome/sfalkner/repositories/github/random_forest_run/build/pyrfr/regression.py in fit(self, data, rng)
   2098 
   2099         """
-> 2100         return _regression.fanova_forest_fit(self, data, rng)
   2101 
   2102 

RuntimeError: The number of data points per tree is set to zero!

OOB becomes nan for larger numbers of num_data_points_per_tree

Here are the results when performing forward selection with PIMP:

num_data_points_per_tree=500
X.shape (3741, 55)

INFO:Forward Selection
Evaluating sp-clause-activity-inc
OOB: 0.306053
RMSE: 0.298566

Evaluating sp-clause-decay
OOB: 0.303003
RMSE: 0.296279

Evaluating sp-clause-del-heur
OOB: 0.337701
RMSE: 0.335855

num_data_points_per_tree=1500
X.shape (3741, 55)

INFO:Forward Selection
Evaluating sp-clause-activity-inc
OOB: 0.311971
RMSE: 0.297653

Evaluating sp-clause-decay
OOB: 0.300374
RMSE: 0.289780

Evaluating sp-clause-del-heur
OOB: 0.337702
RMSE: 0.336227

num_data_points_per_tree=2000
X.shape (3741, 55)

INFO:Forward Selection
Evaluating sp-clause-activity-inc
OOB: 0.318140
RMSE: 0.302362

Evaluating sp-clause-decay
OOB: nan
RMSE: 0.298979

Evaluating sp-clause-del-heur
OOB: 0.337619
RMSE: 0.336822

num_data_points_per_tree=3741
X.shape (3741, 55)

INFO:Forward Selection
Evaluating sp-clause-activity-inc
OOB: nan
RMSE: 0.277847

Evaluating sp-clause-decay
OOB: nan
RMSE: 0.281619

Evaluating sp-clause-del-heur
OOB: nan
RMSE: 0.335883

If I limit the number of data points in X:
num_data_points_per_tree=250
X.shape (500, 55)

INFO:Forward Selection
Evaluating sp-clause-activity-inc
OOB: 0.203998
RMSE: 0.338962

Evaluating sp-clause-decay
OOB: 0.200869
RMSE: 0.330463

Evaluating sp-clause-del-heur
OOB: 0.208321
RMSE: 0.350660

num_data_points_per_tree=333
X.shape (500, 55)

INFO:Forward Selection
Evaluating sp-clause-activity-inc
OOB: nan
RMSE: 0.348709

Evaluating sp-clause-decay
OOB: 0.233963
RMSE: 0.343261

Evaluating sp-clause-del-heur
OOB: 0.234469
RMSE: 0.357129

num_data_points_per_tree=500
X.shape (500, 55)

INFO:Forward Selection
Evaluating sp-clause-activity-inc
OOB: nan
RMSE: 0.331204

Evaluating sp-clause-decay
OOB: nan
RMSE: 0.322969

Evaluating sp-clause-del-heur
OOB: nan
RMSE: 0.348493
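One possible explanation, which is not confirmed by anything in this report, is that some data point ends up in the bootstrap sample of every tree and therefore has no out-of-bag prediction at all. The sketch below only does the arithmetic for that hypothesis. It assumes sampling with replacement, and the function name and the choice of 10 trees are hypothetical, since the number of trees is not stated above.

```python
def prob_never_out_of_bag(num_points, points_per_tree, num_trees):
    """Probability that a fixed data point is in-bag for every tree (assumed model)."""
    # Chance that one tree's bootstrap sample misses the given data point.
    p_oob_one_tree = (1.0 - 1.0 / num_points) ** points_per_tree
    # Chance that every tree contains the point, leaving no OOB prediction.
    return (1.0 - p_oob_one_tree) ** num_trees


settings = [(3741, 500), (3741, 1500), (3741, 2000), (3741, 3741),
            (500, 250), (500, 333), (500, 500)]
for n, m in settings:
    p = prob_never_out_of_bag(n, m, num_trees=10)  # 10 trees is an assumption
    print('N=%d, points per tree=%d: %.3g' % (n, m, p))
```

Under this model the probability grows with num_data_points_per_tree, which would at least be consistent with nan values showing up only for the larger settings.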

Predictions in Log-Cost-Space

I hope I remember yesterday's discussion about predictions in log-cost space correctly, since I left my notes in my office. @frank-hutter if anything is wrong, please correct me.

@sfalkner Frank explained yesterday how he implemented the prediction in log(cost) space in SMAC, and I don't know whether this is currently possible with the new RF. I hope you can help us here.

  • Train the RF using log(cost) values.
  • To get a marginalized prediction over instances:
    1. For each tree, compute a marginalized prediction using exp(log(cost)) of all values in the leaves. This gives one prediction per tree in the original cost space.
    2. Compute the mean and variance over all log(pred_t), one for each tree t (so, again in log space).

How can we compute this with the RF? Is it possible using the python interface? Would it be inefficient to do it in Python? Can it be done within C++?
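The two steps listed above can be sketched in plain NumPy, independent of pyrfr. This is a minimal sketch under assumptions: the function name and its input layout (one sequence of log(cost) leaf values per tree, already collected over all instances) are hypothetical, and "marginalized prediction" is taken to mean the arithmetic mean of the back-transformed leaf values.

```python
import numpy as np


def marginalized_log_prediction(leaf_log_costs_per_tree):
    """Mean and variance in log space of per-tree marginalized predictions.

    leaf_log_costs_per_tree: one sequence per tree holding the log(cost)
    leaf values gathered over all instances (hypothetical input layout).
    """
    per_tree_log_pred = []
    for leaf_values in leaf_log_costs_per_tree:
        # Step 1: back-transform to cost space and average within the tree.
        costs = np.exp(np.asarray(leaf_values, dtype=np.float64))
        per_tree_log_pred.append(np.log(costs.mean()))
    per_tree_log_pred = np.asarray(per_tree_log_pred)
    # Step 2: mean and variance over the trees, again in log space.
    return per_tree_log_pred.mean(), per_tree_log_pred.var()


mean_log, var_log = marginalized_log_prediction([
    [0.0, np.log(3.0)],          # tree 1: costs 1 and 3
    [np.log(2.0), np.log(2.0)],  # tree 2: costs 2 and 2
])
print(mean_log, var_log)
```

Doing this in Python requires access to the leaf values of every tree for every instance, so whether it is efficient enough depends on how cheaply the bindings expose them.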

RuntimeError: Second statistics must not contain as many points as first one!

The cryptic error message "RuntimeError: Second statistics must not contain as many points as first one!" is thrown when trying to fit the forest with the data below.

The following minimal example reproduces the issue:

import numpy as np
from pyrfr import regression as reg

y = np.array([
    0.170108248251,
    0.930876679459,
    0.905099378138,
    0.596442160851,
    -5.0,
    -5.0,
    0.309617890025,
    0.487095030376,
    1.35073047692,
    0.318355093365,
])

X = np.array([
    [0., 0.10322601, 0.35714272],
    [1., 0.48144905, 0.92857184],
    [0., 0.02233043, 0.54761909],
    [0., 0.21689149, 0.64285728],
    [0., 0.47701299, 0.73809546],
    [0., 0.16681381, 0.07142816],
    [0., 0.72408484, 0.11904726],
    [0., 0.97184578, 0.45238091],
    [1., 0.32984456, 0.88095274],
    [0., 0.89288871, 0.45238091],
])

data = reg.data_container(X.shape[1])
data.set_type_of_feature(0, 2)
data.set_bounds_of_feature(1, 0.0, 1.0)
data.set_bounds_of_feature(2, 0.0, 1.0)

for row_x, row_y in zip(X[:10], y.flatten()[:10]):
    data.add_data_point(row_x, row_y)

# for i in range(10):
#     print(data.retrieve_data_point(i), data.response(i))

rng = reg.default_random_engine(12345)

rf_opts = reg.forest_opts()
rf_opts.num_trees = 10
rf_opts.seed = 12345
rf_opts.do_bootstrapping = True
rf_opts.max_features = 3
rf_opts.min_samples_to_split = 3
rf_opts.min_samples_in_leaf = 3
rf_opts.max_depth = 20
rf_opts.epsilon_purity = 1e-8
rf_opts.max_num_nodes = 1000
rf_opts.num_data_points_per_tree = X.shape[0]

forest = reg.binary_rss_forest()
forest.options = rf_opts

forest.fit(data, rng=rng)

Installation error limits.h

We installed swig, cmake, and gcc via conda, but we still get the following error:



(gensim_env) dyn-160-39-178-198:~ prernakashyap$ pip install pyrfr

Collecting pyrfr

  Using cached https://files.pythonhosted.org/packages/ed/0f/4d7e42a9dfef3a1898e03cffa8f1cfcd1f96507d718808b2db584c6f8401/pyrfr-0.8.0.tar.gz

Building wheels for collected packages: pyrfr

  Running setup.py bdist_wheel for pyrfr ... error

  Complete output from command /Users/prernakashyap/anaconda3/envs/gensim_env/bin/python -u -c "import setuptools, tokenize;__file__='/private/var/folders/ty/2lwym9l571l4xvwp44x5c43c0000gn/T/pip-install-9wu83pgt/pyrfr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /private/var/folders/ty/2lwym9l571l4xvwp44x5c43c0000gn/T/pip-wheel-ee82_ljr --python-tag cp36:

  running bdist_wheel

  running build

  running build_ext

  building 'pyrfr._regression' extension

  swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp

  swig -python -c++ -modern -features nondynamic -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i

  creating build

  creating build/temp.macosx-10.7-x86_64-3.6

  creating build/temp.macosx-10.7-x86_64-3.6/pyrfr

  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/prernakashyap/anaconda3/envs/gensim_env/include -arch x86_64 -I/Users/prernakashyap/anaconda3/envs/gensim_env/include -arch x86_64 -I./include -I/Users/prernakashyap/anaconda3/envs/gensim_env/include/python3.6m -c pyrfr/regression_wrap.cpp -o build/temp.macosx-10.7-x86_64-3.6/pyrfr/regression_wrap.o -O2 -std=c++11

  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]

  In file included from /Users/prernakashyap/anaconda3/envs/gensim_env/lib/gcc/x86_64-apple-darwin11.4.2/4.8.5/include-fixed/syslimits.h:7:0,

                   from /Users/prernakashyap/anaconda3/envs/gensim_env/lib/gcc/x86_64-apple-darwin11.4.2/4.8.5/include-fixed/limits.h:34,

                   from /Users/prernakashyap/anaconda3/envs/gensim_env/include/python3.6m/Python.h:11,

                   from pyrfr/regression_wrap.cpp:173:

  /Users/prernakashyap/anaconda3/envs/gensim_env/lib/gcc/x86_64-apple-darwin11.4.2/4.8.5/include-fixed/limits.h:168:61: fatal error: limits.h: No such file or directory

   #include_next <limits.h>  /* recurse down to the real one */

                                                               ^

  compilation terminated.

  error: command 'gcc' failed with exit status 1

  

  ----------------------------------------

  Failed building wheel for pyrfr

  Running setup.py clean for pyrfr

Failed to build pyrfr

Installing collected packages: pyrfr

  Running setup.py install for pyrfr ... error

    Complete output from command /Users/prernakashyap/anaconda3/envs/gensim_env/bin/python -u -c "import setuptools, tokenize;__file__='/private/var/folders/ty/2lwym9l571l4xvwp44x5c43c0000gn/T/pip-install-9wu83pgt/pyrfr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /private/var/folders/ty/2lwym9l571l4xvwp44x5c43c0000gn/T/pip-record-9qiyacqw/install-record.txt --single-version-externally-managed --compile:

    running install

    running build_ext

    building 'pyrfr._regression' extension

    swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp

    swig -python -c++ -modern -features nondynamic -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i

    creating build

    creating build/temp.macosx-10.7-x86_64-3.6

    creating build/temp.macosx-10.7-x86_64-3.6/pyrfr

    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/prernakashyap/anaconda3/envs/gensim_env/include -arch x86_64 -I/Users/prernakashyap/anaconda3/envs/gensim_env/include -arch x86_64 -I./include -I/Users/prernakashyap/anaconda3/envs/gensim_env/include/python3.6m -c pyrfr/regression_wrap.cpp -o build/temp.macosx-10.7-x86_64-3.6/pyrfr/regression_wrap.o -O2 -std=c++11

    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]

    In file included from /Users/prernakashyap/anaconda3/envs/gensim_env/lib/gcc/x86_64-apple-darwin11.4.2/4.8.5/include-fixed/syslimits.h:7:0,

                     from /Users/prernakashyap/anaconda3/envs/gensim_env/lib/gcc/x86_64-apple-darwin11.4.2/4.8.5/include-fixed/limits.h:34,

                     from /Users/prernakashyap/anaconda3/envs/gensim_env/include/python3.6m/Python.h:11,

                     from pyrfr/regression_wrap.cpp:173:

    /Users/prernakashyap/anaconda3/envs/gensim_env/lib/gcc/x86_64-apple-darwin11.4.2/4.8.5/include-fixed/limits.h:168:61: fatal error: limits.h: No such file or directory

     #include_next <limits.h>  /* recurse down to the real one */

                                                                 ^

    compilation terminated.

    error: command 'gcc' failed with exit status 1

    

    ----------------------------------------

Command "/Users/prernakashyap/anaconda3/envs/gensim_env/bin/python -u -c "import setuptools, tokenize;__file__='/private/var/folders/ty/2lwym9l571l4xvwp44x5c43c0000gn/T/pip-install-9wu83pgt/pyrfr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /private/var/folders/ty/2lwym9l571l4xvwp44x5c43c0000gn/T/pip-record-9qiyacqw/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/var/folders/ty/2lwym9l571l4xvwp44x5c43c0000gn/T/pip-install-9wu83pgt/pyrfr/

Has anyone encountered this problem before? The conda version is 4.5.X.

DEPRECATION: pyrfr is being installed using the legacy 'setup.py install' method

pyrfr version: 0.8.3

I tried to pip install pyrfr on the GitHub Actions runners ["ubuntu-latest", "macos-latest", "windows-latest"]. On macOS and Windows, I got the following deprecation warning:

  DEPRECATION: pyrfr is being installed using the legacy 'setup.py install' method, because 
it does not have a 'pyproject.toml' and the 'wheel' package is not installed. pip 23.1 will enforce 
this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion 
can be found at https://github.com/pypa/pip/issues/8559
  Running setup.py install for pyrfr: started
  Running setup.py install for pyrfr: still running...
  Running setup.py install for pyrfr: finished with status 'done'

The installation was successful, though. I then tried installing wheel before installing pyrfr, and the output looked normal:

Building wheels for collected packages: pyrfr
  Building wheel for pyrfr (setup.py): started
  Building wheel for pyrfr (setup.py): still running...
  Building wheel for pyrfr (setup.py): finished with status 'done'
  Created wheel for pyrfr: filename=pyrfr-0.8.3-cp38-cp38-macosx_10_15_x86_64.whl size=508192 sha256=5ac670c98adbb1ade8c32d3a0b8bba414fcf384ec8b40eef56089ba5d80f6195
  Stored in directory: /Users/runner/Library/Caches/pip/wheels/3b/ca/fd/ed04af714a195453416b55d093a747fb022d058772f64519c4
Successfully built pyrfr

Is this something I should be concerned about?

Build Failure

I tried to build this with cmake --build . but the linker fails with the following errors:

/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: warning: relocation against `_ZN5boost9unit_test12lazy_ostream4instE' in read-only section `.text._ZN5boost9unit_test12lazy_ostream8instanceEv[_ZN5boost9unit_test12lazy_ostream8instanceEv]'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `main':
unit_test_regression_forest.cpp:(.text+0x4f): undefined reference to `boost::unit_test::unit_test_main(bool (*)(), int, char**)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `load_diabetes_data()':
unit_test_regression_forest.cpp:(.text+0x258): undefined reference to `boost::unit_test::framework::master_test_suite()'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x3a2): undefined reference to `boost::unit_test::framework::master_test_suite()'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `regression_forest_serialize_test_invoker()':
unit_test_regression_forest.cpp:(.text+0x928): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xb05): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xd1c): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xf33): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x113e): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o:unit_test_regression_forest.cpp:(.text+0x1623): more undefined references to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)' follow
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `regression_forest_serialize_test::test_method()':
unit_test_regression_forest.cpp:(.text+0x1775): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x18df): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x1a34): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x1b50): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x2330): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x24ec): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `regression_forest_update_downdate_tests_invoker()':
unit_test_regression_forest.cpp:(.text+0x2f10): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x30ed): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o:unit_test_regression_forest.cpp:(.text+0x3304): more undefined references to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)' follow
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `regression_forest_update_downdate_tests::test_method()':
unit_test_regression_forest.cpp:(.text+0x3fb0): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x40cc): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x4312): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x442e): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x4598): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x4838): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x4a4d): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `regression_forest_exceptions_tests_invoker()':
unit_test_regression_forest.cpp:(.text+0x5302): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x54df): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x56f6): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x590d): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x5b18): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o:unit_test_regression_forest.cpp:(.text+0x5fd7): more undefined references to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)' follow
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `regression_forest_exceptions_tests::test_method()':
unit_test_regression_forest.cpp:(.text+0x618d): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x62d3): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x63a6): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x6489): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x65c5): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x6698): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x677b): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x6919): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x69ec): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x6acf): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x6d73): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x6e56): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x7172): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x7255): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x7571): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x7654): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x7970): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x7a4d): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `quantile_regression_forest_test_invoker()':
unit_test_regression_forest.cpp:(.text+0x7df1): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x7fce): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x81e5): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x83fc): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x8607): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o:unit_test_regression_forest.cpp:(.text+0x8a96): more undefined references to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)' follow
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `quantile_regression_forest_test::test_method()':
unit_test_regression_forest.cpp:(.text+0x8c4c): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x9035): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x935d): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x94fb): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x974b): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x98e9): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x99e9): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x9bb5): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x9c98): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x9dca): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0x9f79): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xa05c): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xa3c6): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xa4a9): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xaaed): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xabd0): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xaf6b): undefined reference to `boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xb048): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `__static_initialization_and_destruction_0(int, int)':
unit_test_regression_forest.cpp:(.text+0xb3ee): undefined reference to `boost::unit_test::unit_test_log_t::instance()'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xb511): undefined reference to `boost::unit_test::framework::impl::master_test_suite_name_setter::master_test_suite_name_setter(boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xb5c8): undefined reference to `boost::unit_test::decorator::collector_t::instance()'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xb6a5): undefined reference to `boost::unit_test::ut_detail::auto_test_unit_registrar::auto_test_unit_registrar(boost::unit_test::test_case*, boost::unit_test::decorator::collector_t&, unsigned long)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xb6f9): undefined reference to `boost::unit_test::decorator::collector_t::instance()'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xb7d6): undefined reference to `boost::unit_test::ut_detail::auto_test_unit_registrar::auto_test_unit_registrar(boost::unit_test::test_case*, boost::unit_test::decorator::collector_t&, unsigned long)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xb82a): undefined reference to `boost::unit_test::decorator::collector_t::instance()'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xb8fb): undefined reference to `boost::unit_test::ut_detail::auto_test_unit_registrar::auto_test_unit_registrar(boost::unit_test::test_case*, boost::unit_test::decorator::collector_t&, unsigned long)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xb94c): undefined reference to `boost::unit_test::decorator::collector_t::instance()'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text+0xba17): undefined reference to `boost::unit_test::ut_detail::auto_test_unit_registrar::auto_test_unit_registrar(boost::unit_test::test_case*, boost::unit_test::decorator::collector_t&, unsigned long)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `boost::unit_test::lazy_ostream::instance()':
unit_test_regression_forest.cpp:(.text._ZN5boost9unit_test12lazy_ostream8instanceEv[_ZN5boost9unit_test12lazy_ostream8instanceEv]+0x19): undefined reference to `boost::unit_test::lazy_ostream::inst'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `boost::unit_test::make_test_case(boost::function<void ()> const&, boost::unit_test::basic_cstring<char const>, boost::unit_test::basic_cstring<char const>, unsigned long)':
unit_test_regression_forest.cpp:(.text._ZN5boost9unit_test14make_test_caseERKNS_8functionIFvvEEENS0_13basic_cstringIKcEES8_m[_ZN5boost9unit_test14make_test_caseERKNS_8functionIFvvEEENS0_13basic_cstringIKcEES8_m]+0x69): undefined reference to `boost::unit_test::ut_detail::normalize_test_case_name[abi:cxx11](boost::unit_test::basic_cstring<char const>)'
/usr/bin/ld: unit_test_regression_forest.cpp:(.text._ZN5boost9unit_test14make_test_caseERKNS_8functionIFvvEEENS0_13basic_cstringIKcEES8_m[_ZN5boost9unit_test14make_test_caseERKNS_8functionIFvvEEENS0_13basic_cstringIKcEES8_m]+0x10e): undefined reference to `boost::unit_test::test_case::test_case(boost::unit_test::basic_cstring<char const>, boost::unit_test::basic_cstring<char const>, unsigned long, boost::function<void ()> const&)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `bool boost::test_tools::tt_detail::check_frwd<boost::test_tools::tt_detail::equal_impl_frwd, unsigned long, int>(boost::test_tools::tt_detail::equal_impl_frwd, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long const&, char const*, int const&, char const*)':
unit_test_regression_forest.cpp:(.text._ZN5boost10test_tools9tt_detail10check_frwdINS1_15equal_impl_frwdEmiEEbT_RKNS_9unit_test12lazy_ostreamENS5_13basic_cstringIKcEEmNS1_10tool_levelENS1_10check_typeERKT0_PSA_RKT1_SH_[_ZN5boost10test_tools9tt_detail10check_frwdINS1_15equal_impl_frwdEmiEEbT_RKNS_9unit_test12lazy_ostreamENS5_13basic_cstringIKcEEmNS1_10tool_levelENS1_10check_typeERKT0_PSA_RKT1_SH_]+0x1ad): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `bool boost::test_tools::tt_detail::check_frwd<boost::test_tools::tt_detail::equal_impl_frwd, double, double>(boost::test_tools::tt_detail::equal_impl_frwd, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, double const&, char const*, double const&, char const*)':
unit_test_regression_forest.cpp:(.text._ZN5boost10test_tools9tt_detail10check_frwdINS1_15equal_impl_frwdEddEEbT_RKNS_9unit_test12lazy_ostreamENS5_13basic_cstringIKcEEmNS1_10tool_levelENS1_10check_typeERKT0_PSA_RKT1_SH_[_ZN5boost10test_tools9tt_detail10check_frwdINS1_15equal_impl_frwdEddEEbT_RKNS_9unit_test12lazy_ostreamENS5_13basic_cstringIKcEEmNS1_10tool_levelENS1_10check_typeERKT0_PSA_RKT1_SH_]+0x1ad): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: CMakeFiles/ut_regression_forest.dir/unit_test_regression_forest.cpp.o: in function `bool boost::test_tools::tt_detail::check_frwd<boost::test_tools::tt_detail::equal_impl_frwd, unsigned long, unsigned long>(boost::test_tools::tt_detail::equal_impl_frwd, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long const&, char const*, unsigned long const&, char const*)':
unit_test_regression_forest.cpp:(.text._ZN5boost10test_tools9tt_detail10check_frwdINS1_15equal_impl_frwdEmmEEbT_RKNS_9unit_test12lazy_ostreamENS5_13basic_cstringIKcEEmNS1_10tool_levelENS1_10check_typeERKT0_PSA_RKT1_SH_[_ZN5boost10test_tools9tt_detail10check_frwdINS1_15equal_impl_frwdEmmEEbT_RKNS_9unit_test12lazy_ostreamENS5_13basic_cstringIKcEEmNS1_10tool_levelENS1_10check_typeERKT0_PSA_RKT1_SH_]+0x1ad): undefined reference to `boost::test_tools::tt_detail::report_assertion(boost::test_tools::assertion_result const&, boost::unit_test::lazy_ostream const&, boost::unit_test::basic_cstring<char const>, unsigned long, boost::test_tools::tt_detail::tool_level, boost::test_tools::tt_detail::check_type, unsigned long, ...)'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
collect2: error: ld returned 1 exit status
make[2]: *** [tests/CMakeFiles/ut_regression_forest.dir/build.make:103: tests/ut_regression_forest] Error 1
make[1]: *** [CMakeFiles/Makefile2:127: tests/CMakeFiles/ut_regression_forest.dir/all] Error 2
make: *** [Makefile:114: all] Error 2

I am using GCC 10.2.1 and Boost 1.74.0.3 (installed via libboost-all-dev) on Debian 11.

Additional Debugging information

$ apt-cache policy libboost-all-dev
libboost-all-dev:
  Installed: 1.74.0.3
  Candidate: 1.74.0.3
  Version table:
 *** 1.74.0.3 500
        500 http://deb.debian.org/debian bullseye/main amd64 Packages
        100 /var/lib/dpkg/status
$ gcc --version
gcc (Debian 10.2.1-6) 10.2.1 20210110
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ uname -a
Linux 58386a9d0590 5.19.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 05 Sep 2022 18:09:09 +0000 x86_64 GNU/Linux
$ cmake --version 
cmake version 3.18.4

CMake suite maintained and supported by Kitware (kitware.com/cmake).
$ make --version
GNU Make 4.3
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 11 (bullseye)
Release:        11
Codename:       bullseye
$ swig -version

SWIG Version 4.0.2

Compiled with g++ [x86_64-pc-linux-gnu]

Configured options: +pcre

Please see http://www.swig.org for reporting bugs and further information

Installation copies files into include directory

I think this is closely related to #27, but I opened a new issue nevertheless because I'm using a newer version of pyrfr (0.6.0). Upon calling pip install pyrfr, the contents of my include directory change from:

cloog  lzma    mpf2mpfr.h  pcre.h              pcrecpp.h     python3.6m  sqlite3ext.h  tclPlatDecls.h     tk.h           zconf.h
gmp.h  lzma.h  mpfr.h      pcre_scanner.h      pcrecpparg.h  readline    tcl.h         tclTomMath.h       tkDecls.h      zlib.h
isl    mpc.h   openssl     pcre_stringpiece.h  pcreposix.h   sqlite3.h   tclDecls.h    tclTomMathDecls.h  tkPlatDecls.h

to

access.hpp                                 filereadstream.h                pcre.h                          stdint.h
adapters.hpp                               filewritestream.h               pcre_scanner.h                  stream.h
allocators.h                               forest_options.hpp              pcre_stringpiece.h              strfunc.h
array.hpp                                  forward_list.hpp                pcrecpp.h                       string.hpp
array_wrapper.hpp                          functional.hpp                  pcrecpparg.h                    stringbuffer.h
base64.hpp                                 fwd.h                           pcreposix.h                     strtod.h
base_class.hpp                             gmp.h                           pointer.h                       swap.h
biginteger.h                               helpers.hpp                     polymorphic.hpp                 tcl.h
binary.hpp                                 ieee754.h                       polymorphic_impl.hpp            tclDecls.h
binary_fanova_tree.hpp                     inttypes.h                      polymorphic_impl_fwd.hpp        tclPlatDecls.h
binary_split_one_feature_rss_loss.hpp      isl                             portable_binary.hpp             tclTomMath.h
bitset.hpp                                 istreamwrapper.h                pow10.h                         tclTomMathDecls.h
boost_variant.hpp                          itoa.h                          prettywriter.h                  temporary_node.hpp
cereal.hpp                                 json.hpp                        python3.6m                      tk.h
chrono.hpp                                 k_ary_node.hpp                  quantile_regression_forest.hpp  tkDecls.h
classification_forest.hpp                  k_ary_tree.hpp                  queue.hpp                       tkPlatDecls.h
classification_split.hpp                   license.txt                     rapidjson.h                     traits.hpp
cloog                                      list.hpp                        rapidxml.hpp                    tree_base.hpp
common.hpp                                 lzma                            rapidxml_iterators.hpp          tree_options.hpp
complex.hpp                                lzma.h                          rapidxml_print.hpp              tuple.hpp
data_container.hpp                         macros.hpp                      rapidxml_utils.hpp              unordered_map.hpp
data_container_utils.hpp                   manual.html                     reader.h                        unordered_set.hpp
default_data_container.hpp                 map.hpp                         readline                        util.hpp
default_data_container_with_instances.hpp  memory.hpp                      regex.h                         utility.hpp
deque.hpp                                  memorybuffer.h                  regression_forest.hpp           valarray.hpp
diyfp.h                                    memorystream.h                  schema.h                        vector.hpp
document.h                                 meta.h                          set.hpp                         writer.h
dtoa.h                                     mpc.h                           split_base.hpp                  xml.hpp
en.h                                       mpf2mpfr.h                      sqlite3.h                       zconf.h
encodedstream.h                            mpfr.h                          sqlite3ext.h                    zlib.h
encodings.h                                openssl                         stack.h
error.h                                    ostreamwrapper.h                stack.hpp
fanova_forest.hpp                          pair_associative_container.hpp  static_object.hpp

Among those, the file inttypes.h is particularly problematic, because its #error directive aborts compilation with any compiler other than Microsoft Visual C++:

// ISO C9x  compliant inttypes.h for Microsoft Visual Studio
// Based on ISO/IEC 9899:TC2 Committee draft (May 6, 2005) WG14/N1124 
// 
//  Copyright (c) 2006-2013 Alexander Chemeris
// 
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are met:
// 
//   1. Redistributions of source code must retain the above copyright notice,
//      this list of conditions and the following disclaimer.
// 
//   2. Redistributions in binary form must reproduce the above copyright
//      notice, this list of conditions and the following disclaimer in the
//      documentation and/or other materials provided with the distribution.
// 
//   3. Neither the name of the product nor the names of its contributors may
//      be used to endorse or promote products derived from this software
//      without specific prior written permission.
// 
// THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED
// WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
// MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
// EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
// OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 
// WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
// OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
// ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// 
///////////////////////////////////////////////////////////////////////////////

// The above software in this distribution may have been modified by 
// THL A29 Limited ("Tencent Modifications"). 
// All Tencent Modifications are Copyright (C) 2015 THL A29 Limited.

#ifndef _MSC_VER // [
#error "Use this header only with Microsoft Visual C++ compilers!"
#endif // _MSC_VER ]

This makes it impossible to install Auto-sklearn: https://travis-ci.org/automl/auto-sklearn/jobs/282169539
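The before/after comparison of the include directory can be scripted. The following is only a minimal sketch: the helper names `snapshot` and `added_entries` are made up for illustration, and the demo uses a temporary directory with simulated files rather than a real `pip install pyrfr`.

```python
import os
import tempfile


def snapshot(directory):
    """Return the set of entry names currently present in `directory`."""
    return set(os.listdir(directory))


def added_entries(before, after):
    """Return the sorted names present in `after` but not in `before`."""
    return sorted(after - before)


if __name__ == "__main__":
    # Demo on a throwaway directory; a real check would point at the
    # interpreter's include directory and run the installer in between.
    with tempfile.TemporaryDirectory() as include_dir:
        open(os.path.join(include_dir, "zlib.h"), "w").close()
        before = snapshot(include_dir)
        # Simulate an installer dropping a header into the directory.
        open(os.path.join(include_dir, "inttypes.h"), "w").close()
        after = snapshot(include_dir)
        print(added_entries(before, after))
```

Taking one snapshot before and one after the installation and printing the difference gives the list of files that the package added.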

Package can be uninstalled twice

When I install the package a single time, I can uninstall it twice afterwards:

root@1f73fbac46e1:/# pip3 install pyrfr
Collecting pyrfr
  Downloading pyrfr-0.7.0.tar.gz (290kB)
    100% |################################| 296kB 4.0MB/s 
Building wheels for collected packages: pyrfr
  Running setup.py bdist_wheel for pyrfr ... error
  Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-kgescy03/pyrfr/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" bdist_wheel -d /tmp/tmpfbcwaccepip-wheel- --python-tag cp35:
  /usr/lib/python3.5/distutils/dist.py:261: UserWarning: Unknown distribution option: 'python_requires'
    warnings.warn(msg)
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.5
  creating build/lib.linux-x86_64-3.5/pyrfr
  copying pyrfr/__init__.py -> build/lib.linux-x86_64-3.5/pyrfr
  copying pyrfr/docstrings.i -> build/lib.linux-x86_64-3.5/pyrfr
  running build_ext
  building '_regression' extension
  swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
  swig -python -c++ -modern -features nondynamic -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
  creating build/temp.linux-x86_64-3.5
  creating build/temp.linux-x86_64-3.5/pyrfr
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I./include -I/usr/include/python3.5m -c pyrfr/regression_wrap.cpp -o build/temp.linux-x86_64-3.5/pyrfr/regression_wrap.o -O2 -std=c++11
  cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++
  x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.5/pyrfr/regression_wrap.o -o build/lib.linux-x86_64-3.5/_regression.cpython-35m-x86_64-linux-gnu.so
  building '_util' extension
  swigging pyrfr/util.i to pyrfr/util_wrap.cpp
  swig -python -c++ -modern -features nondynamic -I./include -o pyrfr/util_wrap.cpp pyrfr/util.i
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I./include -I/usr/include/python3.5m -c pyrfr/util_wrap.cpp -o build/temp.linux-x86_64-3.5/pyrfr/util_wrap.o -O2 -std=c++11
  cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++
  x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.5/pyrfr/util_wrap.o -o build/lib.linux-x86_64-3.5/_util.cpython-35m-x86_64-linux-gnu.so
  installing to build/bdist.linux-x86_64/wheel
  running install
  running install_lib
  creating build/bdist.linux-x86_64
  creating build/bdist.linux-x86_64/wheel
  copying build/lib.linux-x86_64-3.5/_util.cpython-35m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/wheel
  copying build/lib.linux-x86_64-3.5/_regression.cpython-35m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/wheel
  creating build/bdist.linux-x86_64/wheel/pyrfr
  copying build/lib.linux-x86_64-3.5/pyrfr/__init__.py -> build/bdist.linux-x86_64/wheel/pyrfr
  copying build/lib.linux-x86_64-3.5/pyrfr/docstrings.i -> build/bdist.linux-x86_64/wheel/pyrfr
  running install_egg_info
  running egg_info
  writing pyrfr.egg-info/PKG-INFO
  writing dependency_links to pyrfr.egg-info/dependency_links.txt
  writing top-level names to pyrfr.egg-info/top_level.txt
  warning: manifest_maker: standard file '-c' not found
  
  reading manifest file 'pyrfr.egg-info/SOURCES.txt'
  reading manifest template 'MANIFEST.in'
  writing manifest file 'pyrfr.egg-info/SOURCES.txt'
  Copying pyrfr.egg-info to build/bdist.linux-x86_64/wheel/pyrfr-0.7.0.egg-info
  running install_scripts
  Checking .pth file support in build/bdist.linux-x86_64/wheel/
  /usr/bin/python3 -E -c pass
  TEST FAILED: build/bdist.linux-x86_64/wheel/ does NOT support .pth files
  error: bad install directory or PYTHONPATH
  
  You are attempting to install a package to a directory that is not
  on PYTHONPATH and which Python does not read ".pth" files from.  The
  installation directory you specified (via --install-dir, --prefix, or
  the distutils default setting) was:
  
      build/bdist.linux-x86_64/wheel/
  
  and your PYTHONPATH environment variable currently contains:
  
      ''
  
  Here are some of your options for correcting the problem:
  
  * You can choose a different installation directory, i.e., one that is
    on PYTHONPATH or supports .pth files
  
  * You can add the installation directory to the PYTHONPATH environment
    variable.  (It must then also be on PYTHONPATH whenever you run
    Python and want to use the package(s) you are installing.)
  
  * You can set up the installation directory to support ".pth" files by
    using one of the approaches described here:
  
    https://pythonhosted.org/setuptools/easy_install.html#custom-installation-locations
  
  Please make the appropriate changes for your system and try again.
  
  ----------------------------------------
  Failed building wheel for pyrfr
  Running setup.py clean for pyrfr
Failed to build pyrfr
Installing collected packages: pyrfr
  Running setup.py install for pyrfr ... done
  Could not find .egg-info directory in install record for pyrfr from https://pypi.python.org/packages/25/47/5601a312c7f2edfdcb3aa01ed66575dfa8d8e22b0fd7641e4a5f4ce61c2d/pyrfr-0.7.0.tar.gz#md5=946e34432ba98b40ee9df25775f923a1
Successfully installed pyrfr-0.7.0
You are using pip version 8.1.1, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
root@1f73fbac46e1:/# pip3 list
pip (8.1.1)
pyrfr (0.7.0)
setuptools (20.7.0)
wheel (0.29.0)
You are using pip version 8.1.1, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
root@1f73fbac46e1:/# pip3 uninstall pyrfr
Uninstalling pyrfr-0.7.0:
  /usr/local/lib/python3.5/dist-packages/pyrfr-0.7.0-py3.5-linux-x86_64.egg
Proceed (y/n)? y
  Successfully uninstalled pyrfr-0.7.0
You are using pip version 8.1.1, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
root@1f73fbac46e1:/# pip3 list
pip (8.1.1)
pyrfr (0.7.0)
setuptools (20.7.0)
wheel (0.29.0)
You are using pip version 8.1.1, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
root@1f73fbac46e1:/# pip3 uninstall pyrfr
Uninstalling pyrfr-0.7.0:
  /usr/local/lib/python3.5/dist-packages/pyrfr
  /usr/local/lib/python3.5/dist-packages/pyrfr-0.7.0.egg-info
Proceed (y/n)? y
  Successfully uninstalled pyrfr-0.7.0
You are using pip version 8.1.1, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
root@1f73fbac46e1:/#

You can reproduce the issue with the following commands on a fresh Ubuntu 16.04 installation without Python:

apt-get update
apt-get install build-essential wget git python3-pip python3-dev swig3.0 swig
pip3 install pyrfr
pip3 list
pip3 uninstall pyrfr
pip3 list
pip3 uninstall pyrfr
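To see what is still on disk after the first uninstall, one can list the entries in the dist-packages directory that belong to the package. This is a rough sketch, not part of pyrfr or pip: `find_leftovers` is a hypothetical helper, and it matches purely on file names (the package directory itself, plus `pyrfr-<version>...` eggs and egg-info directories as they appear in the log above).

```python
import os


def find_leftovers(site_dir, name="pyrfr"):
    """Return the sorted entries of `site_dir` that look like they belong to `name`."""
    hits = []
    for entry in os.listdir(site_dir):
        # The package directory is named exactly `name`; eggs and egg-info
        # directories start with "<name>-<version>".
        if entry == name or entry.startswith(name + "-"):
            hits.append(entry)
    return sorted(hits)


if __name__ == "__main__":
    # Path taken from the log above; adjust it for your interpreter.
    site_dir = "/usr/local/lib/python3.5/dist-packages"
    if os.path.isdir(site_dir):
        print(find_leftovers(site_dir))
```

If the script still prints entries after pip3 uninstall pyrfr reports success, a second copy of the package is installed.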

Swig 4.0 nondynamic support causing Segmentation fault

Problem Statement

random_forest_run relies on SWIG's nondynamic support in two places:

  • In the SWIG interface, here.
  • In setup.py, here.

With SWIG 4.0, this causes the following segmentation fault:

# To produce this trace, install the debugging tools: apt install python3-dbg gdb
# Create this dummy file:
# cat reproduce.py 
#     import pyrfr.regression as reg
#     data = reg.default_data_container(64)
# 
# and run: gdb -ex r --args python3 reproduce.py
(gdb) backtrace 
#0  0x00000000005cec00 in PyDict_SetItem (op=op@entry=0x0, key='this', 
    value=value@entry=<SwigPyObject at remote 0x7f9bd68cd180>) at ../Objects/dictobject.c:1526
#1  0x00007f9bd66ecf6f in SWIG_Python_SetSwigThis (
    swig_this=<SwigPyObject at remote 0x7f9bd68cd180>, inst=<optimized out>)
    at pyrfr/regression_wrap.cpp:1979
#2  SWIG_Python_InitShadowInstance (args=<optimized out>) at pyrfr/regression_wrap.cpp:2309

Next Steps

This appears to be a problem with our usage of nondynamic support as tracked here.
