Comments (16)

FuzzyWuzzyIsABear commented on May 24, 2024

This seems related to the relu activation function... I guess the activations grow too large. The problem goes away with tanh.

nTrouvain commented on May 24, 2024

Hello,
Indeed, this is probably linked to the relu function. Could you try the identity function to create a linear reservoir?
Also, what is the size of the reservoir you are using?
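
For reference, a linear reservoir might be set up like the sketch below. This is only a sketch with a toy sine series as placeholder data, and it assumes the Reservoir node accepts the activation by name, as in recent reservoirpy versions:

    import numpy as np
    from reservoirpy.nodes import Reservoir, Ridge

    # Toy one-step-ahead forecasting task (placeholder for the real series).
    X = np.sin(np.linspace(0, 20 * np.pi, 1000)).reshape(-1, 1)
    x_train, y_train = X[:-1], X[1:]

    # A linear reservoir: identity activation instead of the default tanh.
    reservoir = Reservoir(500, sr=0.9, lr=0.2, activation="identity")
    readout = Ridge(ridge=1e-5)

    esn = reservoir >> readout
    esn = esn.fit(x_train, y_train)
    predictions = esn.run(x_train)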

FuzzyWuzzyIsABear commented on May 24, 2024

The identity activation function sort of works; in fact, I get far superior results with it. My forecast error is cut by two thirds, but I get this warning:

/home/chaos/.local/lib/python3.8/site-packages/reservoirpy/nodes/ridge.py:17: LinAlgWarning: Ill-conditioned matrix (rcond=1.61597e-218): result may not be accurate.
return linalg.solve(XXT + ridge, YXT.T, assume_a="sym")

Should the warning be heeded even though the forecast error is far superior with DeepESN?

My shallow ESN had 5000 neurons, which was optimal. The literature suggests keeping that same total and dividing it among the layers, rather than increasing or decreasing the total number of neurons across layers, which is what I did.
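
As a rough illustration of that split, chaining reservoirs might look like this. It is a sketch only: a simple chain like this reads out from the last layer, whereas Gallicchio-style DeepESN readouts typically see the states of all layers.

    from reservoirpy.nodes import Reservoir, Ridge

    # Divide the same 5000-neuron budget over, e.g., 5 layers.
    n_total, n_layers = 5000, 5
    layers = [Reservoir(n_total // n_layers, sr=1.3, lr=0.2)
              for _ in range(n_layers)]

    # Chain the layers so each reservoir feeds the next, then add one readout.
    deep_esn = layers[0]
    for layer in layers[1:]:
        deep_esn = deep_esn >> layer
    deep_esn = deep_esn >> Ridge(ridge=0.007)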

FuzzyWuzzyIsABear commented on May 24, 2024

The results are a little unbelievable, IMO.

nTrouvain commented on May 24, 2024

The warning probably comes from insufficient regularization: does it still appear when you increase the ridge parameter?
Also, be warned that linear ESNs are usually less effective than non-linear ones using, for instance, sigmoid or tanh as activation.
I admit that I do not really understand why you are using linear activation, or even relu. I can't find any trace of this in the Gallicchio paper about DeepESN. tanh is for sure a better choice to maintain numerical stability.
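
A quick way to check is to sweep the ridge coefficient upward and watch whether the warning disappears. The sketch below uses a toy random-walk series and illustrative values, not a recommendation for the real data:

    import numpy as np
    from reservoirpy.nodes import Reservoir, Ridge

    rng = np.random.default_rng(0)
    X = np.cumsum(rng.normal(size=(1200, 1)), axis=0)  # toy random-walk series
    x_train, y_train = X[:999], X[1:1000]
    x_test, y_test = X[1000:-1], X[1001:]

    # The LinAlgWarning should disappear once regularization makes the
    # normal equations of the ridge readout well conditioned.
    for ridge in (1e-8, 1e-5, 1e-3, 1e-1):
        model = Reservoir(500, sr=1.3, lr=0.2, seed=42) >> Ridge(ridge=ridge)
        model = model.fit(x_train, y_train)
        mse = np.mean((model.run(x_test) - y_test) ** 2)
        print(f"ridge={ridge:g}  test MSE={mse:.4f}")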

FuzzyWuzzyIsABear commented on May 24, 2024

The literature says to use tanh, so that's what I originally did. I only experimented with identity in the context of DeepESN because with a shallow ESN it did nothing. I don't know why identity works better than tanh here, but I can't really ignore a forecast error that drops by two thirds; that's a big difference, and the improvement doesn't lie regardless.

I'll test out the ridge parameter in a bit. Right now I'm running some forecasts with the identity function.

FuzzyWuzzyIsABear commented on May 24, 2024

My data is heavily preprocessed, including PCA to denoise it. PCA + DeepESN might well be why the error improved so considerably, but honestly I have no idea.
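
For context, the kind of PCA denoising described might look like this hypothetical sketch (the component count is arbitrary):

    from sklearn.decomposition import PCA

    def pca_denoise(X, n_components=10):
        # Project onto the top principal components and reconstruct,
        # discarding the low-variance directions assumed to be noise.
        pca = PCA(n_components=n_components)
        return pca.inverse_transform(pca.fit_transform(X))

One plausible interaction worth noting: the reconstructed features are nearly collinear, which could also contribute to the ill-conditioned matrix warning from the ridge readout.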

nTrouvain commented on May 24, 2024

How have you chosen the hyperparameters, like the leak rate, the activation function, the spectral radius, the regularization noise, or the ridge? Have you tried to optimize them with and without PCA?

FuzzyWuzzyIsABear commented on May 24, 2024

Yes, definitely without PCA. With PCA + a shallow ESN I get minor improvements.

I did a manual search of the hyperparameters: I looked for various loss valleys and homed in on the areas that presented themselves. I tested all the parameters you mentioned above.
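
For what it's worth, that kind of manual sweep can be written as a small grid loop. This is a sketch on toy placeholder data, with a hypothetical evaluate_esn helper standing in for the real training pipeline:

    import itertools
    import numpy as np
    from reservoirpy.nodes import Reservoir, Ridge

    rng = np.random.default_rng(1)
    X = np.cumsum(rng.normal(size=(1200, 1)), axis=0)  # toy placeholder series
    x_tr, y_tr = X[:799], X[1:800]
    x_va, y_va = X[800:-1], X[801:]

    def evaluate_esn(sr, lr, ridge):
        # Hypothetical helper: train a small ESN, return validation MSE.
        model = Reservoir(300, sr=sr, lr=lr, seed=42) >> Ridge(ridge=ridge)
        model = model.fit(x_tr, y_tr)
        return float(np.mean((model.run(x_va) - y_va) ** 2))

    # Coarse grid over spectral radius, leak rate, and ridge; sort by loss
    # and zoom into the promising valleys by hand, as described above.
    results = []
    for sr, lr, ridge in itertools.product(
            [0.7, 1.0, 1.3], [0.1, 0.2, 0.5], [1e-6, 1e-3]):
        results.append(((sr, lr, ridge), evaluate_esn(sr, lr, ridge)))
    results.sort(key=lambda item: item[1])
    print("best settings:", results[0])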

FuzzyWuzzyIsABear commented on May 24, 2024

Before 0.3.0 there were too many problems with unattended runs of reservoirpy 0.2.4, so I did it manually. Those problems are all gone in 0.3.0.

FuzzyWuzzyIsABear commented on May 24, 2024

DeepESN blows away my models involving convolutional layers.

neuronalX commented on May 24, 2024

Hi @FuzzyWuzzyIsABear,
Thanks for using ReservoirPy!

Could you give us the hyperparameters you use for the reservoir parts? It would help us understand the problems you face.

FuzzyWuzzyIsABear commented on May 24, 2024

I'm testing hyperparameters right now, now that the memory and disk issues are gone (not sure about the disk, but it's not an issue). I'm trying to fit the search into a genetic algorithm, since evolution itself is chaotic.

I started off with:

        numCells = 5000
        activation = identity
        proba = 0.001
        ridge = 0.007
        sr = 1.3 
        lr = 0.2

There's a definite improvement with DeepESN, and there could be more with proper noise & feedback connections; with shallow ESNs, I got a nice boost from feedback. When I initially ran DeepESN and found a significant improvement, I didn't save the seed number. Right now I'm just amassing forecasts and comparing the mode of the distribution.
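
Mapped onto reservoirpy arguments, those settings might read as below. Treating proba as the recurrent connectivity (rc_connectivity) is a guess on my part:

    from reservoirpy.nodes import Reservoir, Ridge

    reservoir = Reservoir(
        units=5000,
        sr=1.3,                 # spectral radius
        lr=0.2,                 # leak rate
        rc_connectivity=0.001,  # "proba", assumed to mean recurrent connectivity
        activation="identity",
    )
    readout = Ridge(ridge=0.007)
    esn = reservoir >> readout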

FuzzyWuzzyIsABear commented on May 24, 2024

> The warning probably comes from insufficient regularization: does it still appear when you increase the ridge parameter? Also, be warned that linear ESNs are usually less effective than non-linear ones using, for instance, sigmoid or tanh as activation. I admit that I do not really understand why you are using linear activation, or even relu. I can't find any trace of this in the Gallicchio paper about DeepESN. tanh is for sure a better choice to maintain numerical stability.

Increasing the ridge parameter didn't help. The problem is with the identity function and/or the data, I guess. Switching to tanh is the only fix I can find.

FuzzyWuzzyIsABear commented on May 24, 2024

Part of my data processing involves using the tanh estimator to scale everything. That, plus PCA, plus a dozen other things, is probably setting off the linear algebra warnings.
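
For reference, the usual tanh-estimator scaling looks like the sketch below; whether it matches the exact preprocessing used here is an assumption. Note that it squashes every feature into (0, 1) with a very shallow slope, which can leave features nearly constant and could contribute to the ill-conditioning warnings:

    import numpy as np

    def tanh_estimator_scale(x):
        # Hampel-style tanh estimator: maps each feature into (0, 1).
        mu, sigma = x.mean(axis=0), x.std(axis=0)
        return 0.5 * (np.tanh(0.01 * (x - mu) / sigma) + 1.0)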

neuronalX commented on May 24, 2024

Hi @FuzzyWuzzyIsABear, did you consider looking at our tutorial on optimizing hyperparameters for reservoirs?
https://github.com/reservoirpy/reservoirpy/tree/master/tutorials/Optimization%20of%20hyperparameters

We recommend using random search at the beginning, to explore and to understand graphically what the important ranges for the parameters are and what the relations between variables look like.
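
As a minimal sketch of that idea, plain random search with log-uniform draws might look as follows; evaluate_esn is the same hypothetical scoring helper sketched earlier in this thread, and the ranges are illustrative:

    import numpy as np

    rng = np.random.default_rng()

    def sample_params():
        # Log-uniform draws: a reasonable default for scale-free hyperparameters.
        return dict(
            sr=10 ** rng.uniform(-1, 1),     # spectral radius in [0.1, 10]
            lr=10 ** rng.uniform(-3, 0),     # leak rate in [0.001, 1]
            ridge=10 ** rng.uniform(-8, 0),  # regularization in [1e-8, 1]
        )

    # Reuses the hypothetical evaluate_esn helper from the earlier sketch.
    trials = []
    for _ in range(100):
        params = sample_params()
        trials.append((params, evaluate_esn(**params)))
    best_params, best_loss = min(trials, key=lambda t: t[1])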

We also wrote a paper on a general method to follow, in the same spirit as the tutorial:
Hinaut, X., & Trouvain, N. (2021). Which hype for my new task? Hints and random search for Echo State Networks hyperparameters. In International Conference on Artificial Neural Networks (pp. 83-97). Springer, Cham.
You can find the preprint PDF here: https://hal.inria.fr/hal-03203318/document
