Comments (14)

LarsHH commented on May 26, 2024

So after 18 trials I also get some repetition
You said that it doesn't converge to the global optimum? Do you remember what the global optimum was?
I think one issue might also be that there is more variation within the hyperparameter settings (for the good ones) than between.

from sherpa.

AlexFuster commented on May 26, 2024

Hi Lars,

I think that is a good and general solution. The little overhead added by the search is nothing compared to the cost of retraining a combination.

My only concern is that if you have something like this:

if unique_values:
    repeated = True
    while repeated:
        parameters = generate_hyperparameter_combination()
        repeated = already_tried(parameters)
    return parameters

you will enter an infinite loop when the algorithm converges, so make sure to detect that convergence when it happens in order to finish the study.
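A minimal sketch of one way around that: bound the retries and treat exhaustion as a convergence signal. The names `generate` and `already_tried` are stand-ins for the real Sherpa internals, not its API:

```python
def sample_unique(generate, already_tried, max_retries=100):
    # Try to draw a combination not seen before; give up after
    # max_retries and return None so the study can be finished
    # instead of looping forever once the search has converged.
    for _ in range(max_retries):
        parameters = generate()
        if not already_tried(parameters):
            return parameters
    return None
```

Returning None (rather than spinning) lets the caller decide to stop the study.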

AlexFuster commented on May 26, 2024

The repeated combinations don't seem to be specific to GPyOpt. I tried the algorithm sherpa.algorithms.Genetic and it's also trying repeated combinations:

Trial 42: {'hidden_size': 264, 'n_layers': 2, 'activation': <function relu at 0x7f1ffe869268>, 'lr': 0.009493107893987409, 'dropout': 0.13462578033866046}

Trial 64: {'hidden_size': 264, 'n_layers': 2, 'activation': <function relu at 0x7f1ffe869268>, 'lr': 0.009493107893987409, 'dropout': 0.13462578033866046}

Can anyone explain why this is happening? I am confused because I know this has been tested a thousand times, so I'm probably just missing something conceptually evident, but I can't figure out what.

LarsHH commented on May 26, 2024

Hi Alex,

I had to dig a little to figure this out.

  1. Bayesian Optimization/GPyOpt repeats trials: I am working on fixing this. Essentially, to model a Sherpa Discrete variable we used a GPyOpt continuous variable and discretized it. Therefore, if GPyOpt tried 111.4 (=> 111) and then wants to try 111.1 (=> 111), you get a repeat. It turns out that we actually don't have to do that, since GPyOpt has functionality to take care of this.
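The collision from that discretization can be reproduced in one line: two distinct continuous suggestions map to the same discrete value (a toy illustration, not Sherpa code):

```python
def discretize(x):
    # Old-style mapping: a continuous GPyOpt suggestion rounded to
    # the nearest integer for Sherpa's Discrete parameter.
    return int(round(x))

print(discretize(111.4), discretize(111.1))  # 111 111 -> a repeated trial
```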

  2. Genetic algorithm: this is a separate issue from the first and an artifact of the algorithm. If the best parameter setting does not change for a while, then eventually you're bound to get a case where all five parameters are copied from that best parameter setting. @colladou, do you have any comments? I don't know exactly what the bounds look like:

        elif (self.mutation_rate <= param_origin and param_origin < self.mutation_rate + (1 - self.mutation_rate) / 2):

    but it seems to me that the chance of getting a duplicate may not be that low.
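A rough back-of-the-envelope check of that chance, under the simplifying assumption that each of the five parameters is independently copied from the best setting with some probability p (the 0.45 here is a hypothetical value, not the actual mutation_rate logic):

```python
p_copy = 0.45              # hypothetical per-parameter copy probability
p_duplicate = p_copy ** 5  # all five parameters copied verbatim
print(round(p_duplicate, 4))  # 0.0185, i.e. not negligible per offspring
```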

AlexFuster commented on May 26, 2024

Nice, that was a tricky one.
I'm going to try all the algorithms available in Sherpa to see if they repeat combinations.

AlexFuster commented on May 26, 2024

By the way, the Genetic algorithm is not in the docs: https://parameter-sherpa.readthedocs.io/en/latest/algorithms/algorithms.html#
I could only find it in the code.

AlexFuster commented on May 26, 2024

I tested population-based training and it doesn't seem to produce repeated configurations, so I will use it instead of the Genetic algorithm.
I think we should focus on the bug with GPyOpt

AlexFuster commented on May 26, 2024

Is the issue with GPyOpt in the roadmap?

AlexFuster commented on May 26, 2024

I see this issue is closed in #45

AlexFuster commented on May 26, 2024

It seems that the solution for #48 has caused this to happen again.
Using mnist_mlp I obtain tons of repeated combinations.
Could you please run that example and see if you can replicate this error?

LarsHH commented on May 26, 2024

That is strange. GPyOpt should now be producing integer values (though as floats) and Sherpa just turns them into ints afterwards, so there shouldn't be any repetition due to rounding. I'm looking into it.

AlexFuster commented on May 26, 2024

It seems to reach the global optimum, which is around
{'Trial-ID': 40, 'Iteration': 6, 'activation': 'relu', 'num_units': 119, 'Objective': 0.06359722506762482}
But in my case, it ended up converging to a suboptimal one:
Trial 57: {'num_units': 122, 'activation': 'sigmoid', 'Objective': 0.0702}
...
Trial 168: {'num_units': 122, 'activation': 'sigmoid', 'Objective': 0.0770}

I think (and that is a task for the user, not for Sherpa) that for neural networks the seed for the network's initialization should be constant across trials, since a configuration can obtain a lower loss than a better one just by chance due to its initialization. I am going to implement it in the Keras example and I'll tell you if that fixes the convergence issue.
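A sketch of that idea using only the standard library (in the actual Keras example you would seed the framework's own RNG as well; the function name is illustrative):

```python
import random

def init_weights(n, seed=0):
    # With a fixed seed, every trial starts from identical weights, so
    # objective differences reflect the hyperparameters, not init luck.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# Two trials with the same seed get the same initialization.
assert init_weights(6) == init_weights(6)
```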

About the repetitions, I would just like to know whether they are an artifact produced by GPyOpt or whether there is a bug in Sherpa that produces them. In case it is GPyOpt's fault, it would be nice for the Sherpa study to keep a record of tried configurations and their objectives, so when the algorithm requests a repeated configuration, the study just returns the saved objective instead of training a model.
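A sketch of that register idea, with configurations keyed by their sorted items (the class and method names are illustrative, not Sherpa's API):

```python
class ObjectiveCache:
    """Remember the objective of every tried configuration so that
    repeats are answered from memory instead of retraining a model."""

    def __init__(self):
        self._seen = {}

    @staticmethod
    def _key(params):
        # Sort items so key order in the dict doesn't matter.
        return tuple(sorted(params.items()))

    def lookup(self, params):
        # Returns the stored objective, or None if never tried.
        return self._seen.get(self._key(params))

    def record(self, params, objective):
        self._seen[self._key(params)] = objective

cache = ObjectiveCache()
cache.record({'num_units': 122, 'activation': 'sigmoid'}, 0.0702)
print(cache.lookup({'activation': 'sigmoid', 'num_units': 122}))  # 0.0702
```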

LarsHH commented on May 26, 2024

Hi Alex,
I'm just now revisiting this and thinking about what you said. One option would be to have a general "unique values" flag somewhere that ensures no values are repeated. I think the user should be able to specify that, since people may think differently about whether they want repeated values or not. It would also keep the interface cleaner, since repetition could in theory happen with many other algorithms too (e.g. Random Search with discrete parameter options).
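To put a number on that Random Search case: the chance of at least one repeat among n independent draws from k discrete options is the classic birthday problem (a quick illustration, not Sherpa code):

```python
def p_repeat(n, k):
    # Probability that n uniform draws from k options contain a repeat:
    # 1 minus the probability that every draw is distinct.
    p_unique = 1.0
    for i in range(n):
        p_unique *= (k - i) / k
    return 1.0 - p_unique

print(round(p_repeat(20, 100), 2))  # ~0.87: repeats are the norm, not the exception
```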

LarsHH commented on May 26, 2024

Closing this issue: this is now a task: https://github.com/sherpa-ai/sherpa/projects/1#card-28465658
