Comments (14)

LarsHH commented on May 26, 2024

So after 18 trials I also get some repetition
You said that it doesn't converge to the global optimum? Do you remember what the global optimum was?
I think one issue might also be that there is more variation within the hyperparameter settings (for the good ones) than between.

from sherpa.

AlexFuster commented on May 26, 2024

Hi Lars,

I think that is a good and general solution. The little overhead added by the search is nothing compared to the cost of retraining a combination.

My only concern is that if you have something like this:

if unique_values:
    repeated = True
    while repeated:
        parameters = generate_hyperparameter_combination()
        repeated = already_tried(parameters)
    return parameters

you will enter an infinite loop when the algorithm converges, so make sure to detect that convergence when it happens in order to finish the study.
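A minimal sketch of one way around that: bound the retries and treat exhaustion as a convergence signal. The names `generate` and `already_tried` are stand-ins for the real Sherpa internals, not its API:

```python
def sample_unique(generate, already_tried, max_retries=100):
    # Try to draw a combination not seen before; give up after
    # max_retries and return None so the study can be finished
    # instead of looping forever once the search has converged.
    for _ in range(max_retries):
        parameters = generate()
        if not already_tried(parameters):
            return parameters
    return None
```

Returning None (rather than spinning) lets the caller decide to stop the study.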

AlexFuster commented on May 26, 2024

The repeated combinations don't seem to be specific to GPyOpt. I tried the algorithm sherpa.algorithms.Genetic and it's also trying repeated combinations:

Trial 42: {'hidden_size': 264, 'n_layers': 2, 'activation': <function relu at 0x7f1ffe869268>, 'lr': 0.009493107893987409, 'dropout': 0.13462578033866046}

Trial 64: {'hidden_size': 264, 'n_layers': 2, 'activation': <function relu at 0x7f1ffe869268>, 'lr': 0.009493107893987409, 'dropout': 0.13462578033866046}

Can anyone explain why this is happening? I am confused because I know this has been tested a thousand times, so I'm probably just missing something conceptually evident, but I can't figure out what.

LarsHH commented on May 26, 2024

Hi Alex,

I had to dig a little to figure this out.

  1. Bayesian Optimization/GPyOpt repeats trials: I am working on fixing this. Essentially, to model a Sherpa Discrete variable we used a GPyOpt continuous variable and discretized it. Therefore, if GPyOpt tried 111.4 (=> 111) and then wants to try 111.1 (=> 111), you get a repeat. It turns out that we actually don't have to do that, since GPyOpt has functionality to take care of this.
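The collision from that discretization can be reproduced in one line: two distinct continuous suggestions map to the same discrete value (a toy illustration, not Sherpa code):

```python
def discretize(x):
    # Old-style mapping: a continuous GPyOpt suggestion rounded to
    # the nearest integer for Sherpa's Discrete parameter.
    return int(round(x))

print(discretize(111.4), discretize(111.1))  # 111 111 -> a repeated trial
```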

  2. Genetic algorithm: this is a separate issue from the first and an artifact of the algorithm. If the best parameter setting does not change for a while, then eventually you're bound to get a case where all five parameters are copied from that best parameter setting. @colladou, do you have any comments? I don't know exactly what the bounds look like:

        elif (self.mutation_rate <= param_origin and param_origin < self.mutation_rate + (1 - self.mutation_rate) / 2):

    but it seems to me that the chance of getting a duplicate may not be that low.
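A rough back-of-the-envelope check of that chance, under the simplifying assumption that each of the five parameters is independently copied from the best setting with some probability p (the 0.45 here is a hypothetical value, not the actual mutation_rate logic):

```python
p_copy = 0.45              # hypothetical per-parameter copy probability
p_duplicate = p_copy ** 5  # all five parameters copied verbatim
print(round(p_duplicate, 4))  # 0.0185, i.e. not negligible per offspring
```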

AlexFuster commented on May 26, 2024

Nice, that was a tricky one.
I'm going to try all the algorithms available in Sherpa to see if they repeat combinations.

AlexFuster commented on May 26, 2024

By the way, the Genetic algorithm is not in the docs: https://parameter-sherpa.readthedocs.io/en/latest/algorithms/algorithms.html#
I could only find it in the code.

AlexFuster commented on May 26, 2024

I tested population-based training and it doesn't seem to produce repeated configurations, so I will use it instead of the Genetic algorithm.
I think we should focus on the bug with GPyOpt

AlexFuster commented on May 26, 2024

Is the issue with GPyOpt in the roadmap?

AlexFuster commented on May 26, 2024

I see this issue is closed in #45

AlexFuster commented on May 26, 2024

It seems that the solution for #48 has caused this to happen again.
Using mnist_mlp I obtain tons of repeated combinations.
Could you please run that example and see if you can replicate this error?

LarsHH commented on May 26, 2024

That is strange. GPyOpt should now be producing integer values (though as floats) and Sherpa just turns them into ints afterwards, so there shouldn't be any repetition due to rounding. I'm looking into it.

AlexFuster commented on May 26, 2024

It seems to reach the global optimum, which is around
{'Trial-ID': 40, 'Iteration': 6, 'activation': 'relu', 'num_units': 119, 'Objective': 0.06359722506762482}
But in my case, it ended up converging to a suboptimal one:
Trial 57: {'num_units': 122, 'activation': 'sigmoid', 'Objective': 0.0702}
...
Trial 168: {'num_units': 122, 'activation': 'sigmoid', 'Objective': 0.0770}

I think (and that is a task for the user, not for Sherpa) that for neural networks the seed for the network's initialization should be constant across trials, since a configuration can obtain a lower loss than a better one just by chance due to its initialization. I am going to implement it in the Keras example and I'll tell you if that fixes the convergence issue.
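A sketch of that idea using only the standard library (in the actual Keras example you would seed the framework's own RNG as well; the function name is illustrative):

```python
import random

def init_weights(n, seed=0):
    # With a fixed seed, every trial starts from identical weights, so
    # objective differences reflect the hyperparameters, not init luck.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# Two trials with the same seed get the same initialization.
assert init_weights(6) == init_weights(6)
```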

About the repetitions, I would just like to know whether they are an artifact produced by GPyOpt or whether there is a bug in Sherpa that produces them. In case it is GPyOpt's fault, it would be nice for the Sherpa study to keep a record of tried configurations and their objectives, so when the algorithm requests a repeated configuration, the study just returns the saved objective instead of training a model.
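A sketch of that register idea, with configurations keyed by their sorted items (the class and method names are illustrative, not Sherpa's API):

```python
class ObjectiveCache:
    """Remember the objective of every tried configuration so that
    repeats are answered from memory instead of retraining a model."""

    def __init__(self):
        self._seen = {}

    @staticmethod
    def _key(params):
        # Sort items so key order in the dict doesn't matter.
        return tuple(sorted(params.items()))

    def lookup(self, params):
        # Returns the stored objective, or None if never tried.
        return self._seen.get(self._key(params))

    def record(self, params, objective):
        self._seen[self._key(params)] = objective

cache = ObjectiveCache()
cache.record({'num_units': 122, 'activation': 'sigmoid'}, 0.0702)
print(cache.lookup({'activation': 'sigmoid', 'num_units': 122}))  # 0.0702
```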

LarsHH commented on May 26, 2024

Hi Alex,
I'm just now revisiting this and thinking about what you said. One option would be to have a general "unique values" flag somewhere that ensures no values are repeated. I think the user should be able to specify that, since people may think differently about whether they want repeated values or not. It would also keep the interface cleaner, since repetition could in theory happen with many other algorithms too (e.g. Random Search with discrete parameter options).
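To put a number on that Random Search case: the chance of at least one repeat among n independent draws from k discrete options is the classic birthday problem (a quick illustration, not Sherpa code):

```python
def p_repeat(n, k):
    # Probability that n uniform draws from k options contain a repeat:
    # 1 minus the probability that every draw is distinct.
    p_unique = 1.0
    for i in range(n):
        p_unique *= (k - i) / k
    return 1.0 - p_unique

print(round(p_repeat(20, 100), 2))  # ~0.87: repeats are the norm, not the exception
```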

LarsHH commented on May 26, 2024

Closing this issue: this is now a task: https://github.com/sherpa-ai/sherpa/projects/1#card-28465658
