darioizzo / dcgp
Implementation of a differentiable CGP (Cartesian Genetic Programming)
License: GNU General Public License v3.0
Invoking https://github.com/darioizzo/dcgp/blob/master/doc/examples/getting_started.py
yields
Traceback (most recent call last):
File "getting_started.py", line 20, in <module>
print("Expression:", ex(in_sym)[0])
NameError: name 'in_sym' is not defined
and after defining in_sym = ["x"]
the sympy module cannot be found.
Where can I find a requirements.txt or a similar list of the Python dependencies that will be installed, and are there additional dependencies that must be installed for the examples to work?
Setup: dcgpy v1.2.1 from PyPI installed via pip
The following code throws a TypeError. I believe the reason is that expression.get_arity() now returns a list with arities per node instead of a single int.
from dcgpy import kernel_set_gdual_vdouble as kernel_set
from dcgpy import expression_weighted_gdual_vdouble as dcgpy_expression
kernels = ['sum', 'diff']
n_in = 1
n_out = 1
rows = 1
cols = 15
levels_back = 16
arity = 2
kernels = kernel_set(kernels)()
ex = dcgpy_expression(n_in, n_out, rows=rows, cols=cols,
levels_back=levels_back, arity=arity,
kernels=kernels)
ex.simplify(['x'], subs_weights=True)
from dcgpy import expression_gdual_vdouble
from dcgpy import kernel_set_gdual_vdouble
kernels = kernel_set_gdual_vdouble(['sum', 'mul'])()
expr = expression_gdual_vdouble(inputs=3, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed = 4)
Now type(expr(['x','y','z'])[0]) is str, while type(expr.simplify(['x','y','z'])[0]) is sympy.core.add.Add. Would it be better to make the two consistent?
At the moment the following code:
dCGP = expression(inputs=1, outputs=2, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed = 13)
print("Simplified expression: ", dCGP.simplify(["x"]))
returns only the simplified expression corresponding to the first output node. It should instead return a list of simplified symbolic expressions (one per output node). This would also be consistent with the return type of the call operator.
Hi,
Fantastic package! I just started using it and it looks like a very useful alternative to Eureqa ever since they stopped giving out free academic licenses.
I just wanted to note that I'm unable to get the symbolic regression example working as it is (though some modifications fix it). When I declare the kernels, it says that the pdiv kernel is unimplemented:
from dcgpy import expression_gdual_vdouble as expression
from dcgpy import kernel_set_gdual_vdouble as kernel_set
from pyaudi import gdual_vdouble as gdual
import pyaudi
from matplotlib import pyplot as plt
import numpy as np
from random import randint
%matplotlib inline
kernels = kernel_set(["sum", "mul", "diff", "pdiv"])()
Gives:
ValueError: Unimplemented function pdiv for this type
I can change pdiv to div and it works, but I was wondering what I was missing to get pdiv working.
Let me know if you'd like more debugging information. I installed from pip on Python 3.7.3, with:
dcgpy.__version__: '1.2.1'
pyaudi.__version__: '1.6.4'
numpy.__version__: '1.16.2'
sympy.__version__: '1.4'
Thanks!
Miles
The following computation is wrong. The result should not be zero when x is 1 and y is 0 (the expected value is log(2), the same as for x = 0 and y = 1):
from dcgpy import expression_gdual_vdouble as expression
from dcgpy import kernel_set_gdual_vdouble as kernel_set
from pyaudi import gdual_vdouble as gdual
kernels = kernel_set(["sum", "mul", "div", "log"])()
dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed = 13)
x = [1, 0, 0, 1, 1, 1, 2, 1, 1, 0, 2, 3, 0, 4, 5, 3, 6, 4, 0, 2, 6, 1, 6, 3, 3, 8, 7, 3, 0, 5, 1, 3, 11, 2, 11, 2, 2, 7, 2, 1, 5, 4, 3, 12, 2, 7]
x = dCGP.set(x)
print(dCGP.simplify(["x", "y"]))
#out[6] = [log(x**2 + y**2 + 1)]
dCGP([gdual([0]), gdual([1])])
#Out[7]: [[0.693147]]
dCGP([gdual([1]), gdual([0])])
#Out[8]: [0]
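A possible explanation, offered as a sketch rather than a confirmed diagnosis: decoding the chromosome by hand shows an unsimplified y/y term, which symbolic simplification reduces to 1 and which would hide a division by zero at y = 0. The decoder below is plain Python and does not use dcgpy. It assumes that each node is stored as one function gene followed by two connection genes, that function genes index the kernel list in the order given, that node ids 0 and 1 are the inputs, and that log reads only its first connection.

```python
# Chromosome copied from the report: 15 nodes x 3 genes, then 1 output gene.
CHROMOSOME = [1, 0, 0, 1, 1, 1, 2, 1, 1, 0, 2, 3, 0, 4, 5, 3, 6, 4, 0, 2, 6,
              1, 6, 3, 3, 8, 7, 3, 0, 5, 1, 3, 11, 2, 11, 2, 2, 7, 2, 1, 5, 4,
              3, 12, 2, 7]
KERNELS = ["sum", "mul", "div", "log"]  # assumption: function gene = list index
N_IN, ARITY = 2, 2


def decode(node_id, names):
    """Return the unsimplified expression computed at node_id as a string."""
    if node_id < N_IN:                      # input node
        return names[node_id]
    start = (node_id - N_IN) * (ARITY + 1)  # assumed flat gene layout
    kernel = KERNELS[CHROMOSOME[start]]
    args = [decode(c, names) for c in CHROMOSOME[start + 1:start + 1 + ARITY]]
    if kernel == "log":                     # assumption: unary, first input only
        return "log(" + args[0] + ")"
    op = {"sum": "+", "mul": "*", "div": "/"}[kernel]
    return "(" + op.join(args) + ")"


unsimplified = decode(CHROMOSOME[-1], ["x", "y"])
print(unsimplified)  # log(((y/y)+((x*x)+(y*y))))
```

Under these assumptions the constant 1 in log(x**2 + y**2 + 1) is really y/y, so evaluating at y = 0 involves 0/0 even though the simplified form looks well defined there.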
Nice looking package! I'm eager to try it out as an alternative to Eureqa. Here are the issues I am facing:
Describe the bug
2 issues:
---------------------------------------------------------------------------
ArgumentError Traceback (most recent call last)
<ipython-input-9-0a362e511eb6> in <module>
1 ss = dcgpy.kernel_set_double(["sum", "diff", "mul", "pdiv"])
2 udp = dcgpy.symbolic_regression(points=X, labels=Y, kernels=ss())
----> 3 uda = dcgpy.es4cgp(gen=10000, max_mut=4)
4 prob = pg.problem(udp)
5 algo = pg.algorithm(uda)
ArgumentError: Python argument types in
es4cgp.__init__(es4cgp)
did not match C++ signature:
__init__(_object*, unsigned int gen=1, unsigned int mut_n=1, double ftol=0.0001, bool learn_constants=True, unsigned int seed)
__init__(_object*, unsigned int gen=1, unsigned int mut_n=1, double ftol=0.0001, bool learn_constants=True)
__init__(_object*)
To Reproduce
conda-forge install with Python 3.7, then run the tutorial.
Environment (please complete the following information):
Is it possible to evolve the topology of a neural network through mutation? (I know the implementation does not do crossover) Maybe I am not following the python code correctly, but it appears that in the feed forward neural network example, the following creates a fixed topology. If this was possible, the implementation could be very powerful (the code as it is is also quite powerful).
dcgpann = dcgpy.encode_ffnn(2,1,[50,20,10],["sig", "sig", "sig", "sum"], 5)
Invoking https://github.com/darioizzo/dcgp/blob/master/doc/examples/getting_started.py
yields
Traceback (most recent call last):
File "getting_started.py", line 33, in <module>
print("Expression in x=1.2:", ex([x])[0])
TypeError: call(): incompatible function arguments. The following argument types are supported:
1. (self: dcgpy.core.expression_gdual_double, arg0: List[audi::gdual<double, obake::polynomials::d_packed_monomial<unsigned long long, 8u, void> >]) -> List[audi::gdual<double, obake::polynomials::d_packed_monomial<unsigned long long, 8u, void> >]
2. (self: dcgpy.core.expression_gdual_double, arg0: List[str]) -> List[str]
When I comment out lines 33 and 34, i.e. both lines that use x = gdual(1.2, "x", 2), the script runs through.
I simply ran:
conda config --add channels conda-forge
conda install dcgp-python
in a new conda environment. It's hard to tell what's going on, but I run into the same problem when trying out one of the ODE problems.
Furthermore, I had to install graphviz to make it run this far in the first place.
I have a problem with evaluation of expression_gdual_vdouble on a gdual_vdouble. If I run the following program
from pyaudi import gdual_vdouble as gdual
import pyaudi
from dcgpy import expression_gdual_vdouble as expression
from dcgpy import kernel_set_gdual_vdouble as kernel_set
import numpy as np
kernels = kernel_set(["sum", "mul", "diff", "div"])()
dCGP = expression(1, 1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed=np.random.randint(1233456))
x = np.linspace(0,1,10)
x = gdual(x)
dCGP([x])
I get the following error:
TypeError: No registered converter was able to produce a C++ rvalue of type std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > from this Python object of type gdual_vdouble
The docs say that dcgpy.expression_ann can take ephemeral constants; however, the Python constructor does not allow it.
https://darioizzo.github.io/dcgp/docs/python/expression_ann.html
Some of the example code does not work, for example
>>> from dcgpy import *
>>> dcgp = expression_double(1,1,1,10,11,2,kernel_set(["sum","diff","mul","div"])(), 0u, 32u)
>>> print(dcgp)
does not work (there is no "expression_double" anymore).
There are inconsistencies in the signatures of some functions, e.g.:
Symbolic regression wants to have columns
https://darioizzo.github.io/dcgp/docs/python/symbolic_regression.html
dCGP-expressions want to have cols
https://darioizzo.github.io/dcgp/docs/python/expression.html
Hi,
I want to add "constants" to the function set. My first guess was to use a function that returns a random value, but the problem is that it returns a different value every time it is called.
Is there a way to add constants?
By the way, the only way to add functions to the function set at the moment is by name. Maybe it would be better to add a method that adds functions directly. If you think it is a good idea, I can make a PR.
Thanks
Alberto
dcgpann = dcgpy.encode_ffnn(1, 1, [5] , ['sig', 'sum'] , levels_back=1)
dcgpann.randomise_biases()
This network has 7 active nodes (1 input, 1 output + 5 hidden).
There should be 6 biases (5 hidden + 1 output).
The input node should not have a bias, so querying for node_id = 0 should not give a bias.
But dcgpann.get_bias(0) gives the bias of the node with id 1 instead. This is confusing and currently inconsistent with the documentation.
Moreover, it is possible to query for biases outside the bounds, potentially leading to undefined behavior:
dcgpann.get_bias(42)
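The behavior asked for here can be sketched in plain Python. BiasTable is a hypothetical stand-in, not the dcgpy implementation; it assumes node ids count the inputs first, so the first node that owns a bias has id n_in, and it rejects both input ids and out-of-range ids instead of silently shifting or reading past the end.

```python
class BiasTable:
    """Hypothetical bias storage with node-id translation and bounds checks."""

    def __init__(self, n_in, biases):
        self.n_in = n_in            # input nodes carry no bias
        self.biases = list(biases)  # one bias per non-input node

    def get_bias(self, node_id):
        # Valid ids are n_in .. n_in + len(biases) - 1.
        if node_id < self.n_in:
            raise ValueError(f"node {node_id} is an input node and has no bias")
        if node_id >= self.n_in + len(self.biases):
            raise ValueError(f"node {node_id} is out of bounds")
        return self.biases[node_id - self.n_in]


# Mirrors the network in the report: 1 input, 5 hidden nodes, 1 output.
table = BiasTable(n_in=1, biases=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
print(table.get_bias(1))  # bias of the first hidden node
```

With this layout, get_bias(0) and get_bias(42) both raise a ValueError rather than returning a value.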
Describe the bug
A large number of ephemeral constants causes Python to crash when doing symbolic regression.
To Reproduce
Steps to reproduce the behavior: set n_eph to be 5 or larger in the argument to dcgpy.symbolic_regression.
Screenshots
Here is the log, with verbosity=1000 (I'm not sure what the highest verbosity level is)
2aab23143000-2aab23144000 ---p 00000000 00:00 0
2aab23144000-2aab23384000 rw-p 00000000 00:00 0
2aab24000000-2aab24021000 rw-p 00000000 00:00 0
2aab24021000-2aab28000000 ---p 00000000 00:00 0
2aab28000000-2aab28021000 rw-p 00000000 00:00 0
2aab28021000-2aab2c000000 ---p 00000000 00:00 0
555555554000-5555555af000 r--p 00000000 00:2c 4221074 /mnt/home/mcranmer/miniconda3/envs/dcgpy/bin/python3.7
5555555af000-555555788000 r-xp 0005b000 00:2c 4221074 /mnt/home/mcranmer/miniconda3/envs/dcgpy/bin/python3.7
555555788000-55555582f000 r--p 00234000 00:2c 4221074 /mnt/home/mcranmer/miniconda3/envs/dcgpy/bin/python3.7
55555582f000-555555832000 r--p 002da000 00:2c 4221074 /mnt/home/mcranmer/miniconda3/envs/dcgpy/bin/python3.7
555555832000-55555589b000 rw-p 002dd000 00:2c 4221074 /mnt/home/mcranmer/miniconda3/envs/dcgpy/bin/python3.7
55555589b000-555559556000 rw-p 00000000 00:00 0 [heap]
7ffffffdc000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Gen: Fevals: Best loss: Ndf size: Compl.:
0 0 1.41375 4 7
[I 22:01:59.528 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
Environment (please complete the following information):
conda create --name dcgpy --channel conda-forge python=3.7 numpy scipy matplotlib cython h5py pyfftw notebook tqdm jupyter jupyterlab mpi4py openmpi astropy ipywidgets dcgp-python -y
Versions:
python: 3.7.5
dcgpy: 1.5
pygmo: 2.15.0
Describe the bug
For some reason, loss(X,y, "CE") always returns 0.0 but works fine for MSE.
To Reproduce
Expected behavior
A non-zero cross-entropy loss.
Environment (please complete the following information):
In cases where there are a lot of input nodes (but a relatively small network) the "try and discard"-while loops in this area are somewhat inefficient.
Hello,
I am sorry if I am missing something obvious, but I am having a lot of trouble defining a custom kernel. I would like to use the Lambert W function as a kernel. It is implemented in Boost.
I tried to adapt the first C++ example by adding the kernel as specified in the documentation for kernel and in the one for kernel_set, as follows:
#include <boost/math/special_functions/lambert_w.hpp>
// [...]
template <typename T>
inline T my_lambertw(const std::vector<T> &in) {
return boost::math::lambert_w0(in[0]);
}
inline std::string print_my_lambertw(const std::vector<std::string> &in) {
std::string retval(in[0]);
return "Lw(" + retval + ")";
}
int main() {
// [...]
kernel<double> f(my_lambertw<double>, print_my_lambertw, "my_lambertw");
kernel_set<double> kernels({"sum", "diff", "mul", "pdiv"});
kernels.push_back(f);
symbolic_regression udp(X, Y, 1, 20, 21, 2, kernels(), n_eph);
// [...]
}
The compilation succeeds but at execution the program fails with
terminate called after throwing an instance of 'std::invalid_argument'
what(): Unimplemented function my_lambertw for this type
Aborted
The error makes me think that the kernel should maybe be implemented for dual numbers as well? But I don't know how to do it.
Could you please help me to understand the issue? When I understand how to do it, I would be very happy to complete the documentation or write a short example exhibiting custom kernels to help future users.
Thank you very much in advance!
Hi,
I installed dcgpy as described and end up getting a ModuleNotFoundError
conda create -n dcgpy-dev python=3.8
conda activate dcgpy-dev
conda install dcgp-python -c conda-forge
python -c "from dcgpy import test; test.run_test_suite(); import pygmo; pygmo.mp_island.shutdown_pool(); pygmo.mp_bfe.shutdown_pool()"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/mq/repo/github/dcgp/dcgpy/__init__.py", line 2, in <module>
from ._version import __version__
ModuleNotFoundError: No module named 'dcgpy._version'
Hi @darioizzo,
I was wondering if you had any hyperparameter tuning advice for symbolic regression? I've read descriptions of hyperparameters, but don't have much intuition for what they should be chosen to be.
For a given multi-objective symbolic regression problem, how should I select the following?
I've started with the ones given in the examples. But I have no intuition for how to tune them to a different problem. Furthermore, if I encounter one of the following situations, how should I re-tune my parameters?
Thanks!
Miles
I was missing the environment creation in the conda command. Now I am able to install everything and run the python tests referenced on the repository. Thank you very much for the help. I think you should change the installation command in the README to the one that you mentioned (that includes the environment creation). I am not very familiar with conda and there could be readers who are also not experts.
But I am still unable to build the git package. The error is as follows:
CMake Error at /Users/shah/Documents/anaconda/anaconda3/envs/dcgpy/lib/cmake/audi/audi-config.cmake:9 (find_package):
  By not providing "Findobake.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "obake", but
  CMake did not find one.
  Could not find a package configuration file provided by "obake" with any of
  the following names:
    obakeConfig.cmake
    obake-config.cmake
  Add the installation prefix of "obake" to CMAKE_PREFIX_PATH or set
  "obake_DIR" to a directory containing one of the above files. If "obake"
  provides a separate development package or SDK, be sure it has been
  installed.
Call Stack (most recent call first):
  CMakeLists.txt:193 (find_package)
In the plot of keras vs dCGP, is the x-axis wrongly labeled as RMSE? Should it be epochs?
https://darioizzo.github.io/dcgp/notebooks/dCGPANNs_for_function_approximation.html
If symbolic expressions become too convoluted/long during evolution, the log-file grows extremely in size until it fills up the memory, leading to poor performance and freezes. The following code can produce this behavior:
import dcgpy
import pygmo as pg
X, Y = dcgpy.generate_salutowicz()
kernels = dcgpy.kernel_set_double(["sum", "diff", "mul", "cos"])
udp_dcgpy = dcgpy.symbolic_regression(
points = X,
labels = Y,
kernels=kernels(),
rows = 1,
cols = 100,
n_eph = 0,
levels_back = 5, # this setting creates convoluted and complex expressions
multi_objective=False)
prob = pg.problem(udp_dcgpy)
uda = dcgpy.es4cgp(gen = 100, max_mut = 4)
algo = pg.algorithm(uda)
# uncomment the following line to eat up your memory
#algo.set_verbosity(10)
pop = pg.population(prob, 4)
pop = algo.evolve(pop)
I am trying to build the C++ examples. Here is the cmake command I am using:
cmake -DCMAKE_PREFIX_PATH=/Users/shah/Documents/anaconda/anaconda3/envs/dcgpy/ -DCMAKE_CXX_STANDARD_INCLUDE_DIRECTORIES=/Users/shah/Documents/anaconda/anaconda3/envs/dcgpy/include/ .
The Eigen library is installed in an eigen3 directory, and it appears that it is unable to find headers inside the eigen3/Eigen/ directory.
/Users/shah/Documents/anaconda/anaconda3/envs/dcgpy/include/audi/invert_map.hpp:4:10: fatal error: 'Eigen/Dense' file not found
#include <Eigen/Dense>
         ^~~~~~~~~~~~~
Describe the bug
In contrast to es4cgp, which can be constrained to run single-core by setting parallel_batches = 0 in the corresponding symbolic regression udp, the memetic variant mes4cgp does not have this option and reserves all available CPUs. This makes it hard or impossible to deploy fairly on large compute servers.
To Reproduce
The behavior should be reproducible simply by running some tutorial code.
Expected behavior
mes4cgp should respect parallel_batches, similar to es4cgp.
Environment (please complete the following information):
Hi,
What is the best way to check whether an expression depends on all its input variables?
I use CGP to solve some PDEs. Sometimes the expression does not depend on all its input variables, but I want the solution to depend on all of them. At the moment I check whether the first derivatives are zero and, if so, add a penalty term. It works (without it the evolution gets stuck in non-optimal solutions), but I don't know if it is the best way.
Thanks
Alberto
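One alternative to the derivative test is a structural check: walk the graph backwards from the output genes and collect the input nodes that are reachable. The sketch below is plain Python and makes assumptions about the chromosome layout (one function gene followed by arity connection genes per node, output genes at the end, input nodes numbered first); active_inputs is a hypothetical helper, not a dcgpy function.

```python
def active_inputs(chromosome, n_in, n_out, arity):
    """Return the set of input node ids reachable from the output genes."""
    genes_per_node = arity + 1
    stack = list(chromosome[len(chromosome) - n_out:])  # output genes
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node >= n_in:  # function node: follow its connection genes
            start = (node - n_in) * genes_per_node
            stack.extend(chromosome[start + 1:start + genes_per_node])
    return {node for node in seen if node < n_in}


# Toy chromosome with 3 inputs: node 3 = f0(in0, in1), node 4 = f1(3, 3), output = 4.
toy = [0, 0, 1, 1, 3, 3, 4]
used = active_inputs(toy, n_in=3, n_out=1, arity=2)
missing = set(range(3)) - used
print(used, missing)
```

This check is cheap but only structural: it still reports an input as used when the dependence cancels out (for example x - x), and it counts connections that a unary kernel ignores, so the derivative test remains the stricter criterion.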
from dcgpy import expression_gdual_vdouble
from dcgpy import kernel_set_gdual_vdouble
import pickle
kernels = kernel_set_gdual_vdouble(['sum', 'mul'])()
expr = expression_gdual_vdouble(inputs=3, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed = 4)
pickle.dumps(expr)
Results in a RuntimeError. Cloudpickle produces no error, but also nothing that could be deserialized again.
There should be a way to retrieve individuals of a population of symbolic_regression udps as expression_double, expression_gdual_double, etc.
Right now, the raw chromosome is accessible through pop.get_x(), but expressions would be easier to work with.
First off, I should say that in early 2023, I had no issues installing and using dcgpy on Windows with Conda.
Now, however, when importing core.pyd via SomeCondaInstallation\envs\dcgp\Lib\site-packages\dcgpy\__init__.py, I consistently run into
ImportError: DLL load failed while importing core: The specified module could not be found.
The remarkable thing about it is that everything works just fine with the pyaudi core.pyd import!
The core.pyd file for dcgpy is in its place and does not differ from former successful installations on other Windows machines. So, no worries about that.
I tried with several Windows machines and different versions of Anaconda and Miniconda to no avail.
My only guess is that the issue might have started with the dcgpy conda package using Python 3.11.2, but I could easily be wrong about that.
Hi,
I can use d-CGP with C++, but when I tried to use the Python interface I got the following error:
lib/python2.7/site-packages/dcgpy/_core.so: undefined symbol: _ZN5boost6python7objects23register_dynamic_id_auxENS0_9type_infoEPFNSt3__14pairIPvS2_EES5_E
I checked _core.so with ldd, and it is linked against libpython2.7.so and Boost.Python.
Do you have any idea what causes this error?
Thanks
Alberto
set_output_f exists only for dcgpann types at the moment, which is a pity. It makes it difficult to simply swap the type of dcgp expression for something else (like expression_double). Having better control over the output is always helpful and should not be limited to ANNs.
Hi @darioizzo,
I was wondering if there were any options for parallelism in dCGPy for symbolic regression? I thought about parallelizing the following loops:
for i in range(offsprings):
    dCGP.set(best_chromosome)
    cumsum=0
    dCGP.mutate_active(i+1)
    fit, constant[i] = err2(dCGP, x, yt, best_constants)
    fitness[i] = fit
    chromosome[i] = dCGP.get()
in the run_experiment code in Python, but it appears dCGP is a global variable.
Thanks,
Miles
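A general pattern for this kind of loop, sketched with toy stand-ins rather than dcgpy objects: make each offspring evaluation a function that works on its own private copy of the parent, so that no global expression is shared between workers. Here evaluate_offspring, the integer mutation and the sum-of-squares loss are all hypothetical placeholders for the set / mutate_active / err2 steps in the loop above.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def evaluate_offspring(best_chromosome, n_mutations, seed):
    """Mutate a private copy of the parent and score it (toy stand-ins)."""
    rng = random.Random(seed)              # per-task RNG, no shared state
    child = list(best_chromosome)          # private copy of the parent
    for _ in range(n_mutations):
        child[rng.randrange(len(child))] = rng.randint(0, 9)
    fitness = sum(g * g for g in child)    # toy loss, not the err2 from the post
    return fitness, child


best_chromosome = [3, 1, 4, 1, 5, 9, 2, 6]
offsprings = 4

with ThreadPoolExecutor(max_workers=offsprings) as pool:
    futures = [pool.submit(evaluate_offspring, best_chromosome, i + 1, i)
               for i in range(offsprings)]
    results = [f.result() for f in futures]  # same order as submission

fitness = [fit for fit, _ in results]
chromosome = [child for _, child in results]
print(fitness)
```

A thread pool only helps if the heavy work releases the interpreter lock; for CPU-bound Python code a process pool is the usual substitute, in which case the worker would also need to rebuild its own expression object from picklable arguments.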
Just wondering whether there is a way to use previous formulas as starting points for evolution?
Mostly, to load them as the initial population (instead of starting from scratch every time).
Thanks
Using expression.set_weight(node_id, input_id, w) will always set the weight at input_id=0 of node_id=node_id to w.
Minimal working example:
import dcgpy
expr = dcgpy.expression_weighted_double(1, 1, 1, 1, 2, 2, dcgpy.kernel_set_double(['sum', 'diff', 'div', 'mul'])())
print(expr.get_weights()) # prints [1.0, 1.0]
expr.set_weight(0, 1, 0.5) # should set the second weight to 0.5
print(expr.get_weights()) # prints [0.5, 1.0]
If I see it right the issue is that in the C++ function we get the node_id but do not add the input_id when constructing the index to the weight vector.
If you agree that this is the problem and if it helps, I can prepare a pull request for that.
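The indexing described above can be illustrated in plain Python. This is a sketch under assumptions, not the library's code: weights are taken to live in one flat list with arity consecutive entries per node, and node_idx counts nodes from 0 as in the minimal example (if node ids include the input nodes, the number of inputs would have to be subtracted first). weight_index is a hypothetical helper.

```python
def weight_index(node_idx, input_id, arity):
    """Flat index of the weight on connection input_id of node node_idx."""
    if not 0 <= input_id < arity:
        raise ValueError("input_id must be in [0, arity)")
    # The reported behavior corresponds to dropping the '+ input_id' term.
    return node_idx * arity + input_id


weights = [1.0, 1.0]                         # one node with arity 2
weights[weight_index(0, 1, arity=2)] = 0.5   # second weight of node 0
print(weights)  # [1.0, 0.5]
```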
I've tried to install dcgpy and pyaudi from both pip2.7 and pip3.
Both of them threw the same error:
[bernardo@sager-pc Downloads]$ sudo pip install dcgpy
Collecting dcgpy
Could not find a version that satisfies the requirement dcgpy (from versions: )
No matching distribution found for dcgpy
[bernardo@sager-pc Downloads]$ sudo pip2 install pyaudi
Collecting pyaudi
Could not find a version that satisfies the requirement pyaudi (from versions: )
No matching distribution found for pyaudi