We have a Grid tuning strategy but should add a stoch

Add tuning by stochastic search (alan-turing-institute/mlj.jl)

Comments (6)

ablaom commented on May 17, 2024

Okay, I'm planning to implement this soon. Below is the doc-string for the implementation I am proposing. (Yes, it could be more user-friendly.) Feedback on the proposal very welcome. I plan to start this early next week (16/17 March).

Context:

- Tuning section of the user manual, including doc-strings for TunedModel (the user's main interface point for scheduling tuning), range, sampler (for wrapping ranges as samplers) and fit(::Univariate, ::ParamRange) (for fitting distributions to a range).
- The tuning strategy API

Proposed doc-string for random search tuning strategy

RandomSearch(bounded=Distributions.Uniform,
             positive_unbounded=Distributions.Gamma,
             other=Distributions.Normal,
             rng=Random.GLOBAL_RNG)

Instantiate a random search tuning strategy for searching over
Cartesian hyperparameter domains.

Supported ranges:

- A single one-dimensional range (ParamRange object) r, or a pair
  of the form (r, d), where d is a probability vector of the same
  length as r.values, if r is a NominalRange, and is otherwise:
  (i) any Distributions.Univariate instance; or (ii) one of the
  subtypes of Distributions.Univariate listed in the table below,
  for automatic fitting using Distributions.fit(d, r).
- Any vector of objects of the above form

| distribution types | for fitting to ranges of this type |
|---|---|
| `Arcsine`, `Uniform`, `Biweight`, `Cosine`, `Epanechnikov`, `SymTriangularDist`, `Triweight` | bounded |
| `Gamma`, `InverseGaussian`, `Poisson` | positive |
| `Normal`, `Logistic`, `LogNormal`, `Cauchy`, `Gumbel`, `Laplace` | any |

ParamRange objects are constructed using the range method.

Example range 1:

range(model, :hyper1, lower=1, origin=2, unit=1)

Example range 2:

[(range(model, :hyper1, lower=1, upper=10), Arcsine),
  range(model, :hyper2, lower=2, upper=4),
  (range(model, :hyper2, lower=2, upper=4), Normal(0, 3)),
  (range(model, :hyper3, values=[:ball, :tree]), [0.3, 0.7])]

Note: All the field values of the ParamRange objects (:hyper1,
:hyper2, :hyper3 in the preceding example) must refer to field
names of a single model (the model specified during TunedModel
construction).

Algorithm

Models for evaluation are generated by sampling each range r using
rand(rng, s), where s = sampler(r, d). See sampler for details. If d
is not specified, then sampling is uniform (with replacement) in the
case of a NominalRange, and is otherwise given by the defaults
specified by the tuning strategy parameters bounded,
positive_unbounded, and other, depending on the NumericRange
type.

See also TunedModel, range, sampler.
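The sampling step described under "Algorithm" can be sketched in plain Julia, without MLJ. This is only an illustration under stated assumptions: `random_search` and the closures in `samplers` are hypothetical stand-ins for the proposed ranges and samplers, not part of the MLJ API, and the distributions are hard-coded rather than fitted to a range.

```julia
using Random

# Hypothetical helper (not MLJ): draw `n` candidate configurations, each made
# of one independent draw per hyperparameter.
function random_search(samplers, n; rng=Random.default_rng())
    names = keys(samplers)
    return [NamedTuple{names}(map(s -> s(rng), values(samplers))) for _ in 1:n]
end

# Stand-ins for ranges: each closure maps an RNG to one sampled value.
samplers = (
    hyper1 = rng -> 1 + 9 * rand(rng),                # uniform on [1, 10]
    hyper2 = rng -> 2 + 2 * rand(rng),                # uniform on [2, 4]
    hyper3 = rng -> rand(rng) < 0.3 ? :ball : :tree,  # nominal, probabilities 0.3 and 0.7
)

candidates = random_search(samplers, 5; rng=MersenneTwister(123))
```

Each element of `candidates` is a named tuple such as `(hyper1 = ..., hyper2 = ..., hyper3 = ...)`. In the proposal, each such draw would instead set the corresponding fields of a copy of the model before evaluation.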

from mlj.jl.

fkiraly commented on May 17, 2024

Would it be worth writing up the design for tuning strategies?
You mention a "best" method next to the obvious "fit/predict".

However, should the best strategy (as per tuning) not be queryable via a "fitted model" interface which has these as parameters? E.g., in the same (interface) way as you would query the coefficients and CI of a linear model.


ablaom commented on May 17, 2024

Good point.

The "best strategy" is indeed implemented via a "fitted model" interface, as shown in the README.md. The best method doesn't compute anything, it just retrieves the model (i.e. hyper parameters) that the fitting (=tuning) process determined (and used for fitting the final model to all available data).

If a user wants details about a fit-result (e.g., coefficients of a linear model), then they would look for these in the report field of the corresponding machine. I could do the same here and drop best, no problem.


ablaom commented on May 17, 2024

As to writing up design for genetic algorithm: Good idea. Do we have any volunteers?


tlienart commented on May 17, 2024

this sounds fantastic. I'm a bit confused by your example, shouldn't the third one be a bounded distr? or do you automatically truncate over the range?

I also wonder a few things:

1. whether you could extend the current syntax to pass a distribution or a sampler, where a sampler is anything that can be queried and could be user-defined
2. in light of (1), whether you could pass the history ("context") to the sampler
3. at the moment, unless I misunderstand something, it seems you would sample equally in all dimensions (one configuration = one sample per HP). This makes sense; however, I could see interest in sampling more along specific dimensions, in which case you may want to pass a number of samples?

Apologies if these questions are poorly formulated, and great work as always


ablaom commented on May 17, 2024

@tlienart Thanks for that!

> I'm a bit confused by your example, shouldn't the third one be a bounded distr? or do you automatically truncate over the range?

Yes, sampler(r, d) always creates a sampler truncated to the range, but this should be made explicit in the current docstring, thanks.
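To make "truncated to the range" concrete, here is a minimal sketch of the third example, Normal(0, 3) restricted to [2, 4], using rejection sampling and only Julia's standard library. The thread does not say how sampler(r, d) truncates internally, so this is an assumption-laden illustration and `sample_truncated_normal` is a hypothetical helper; Distributions.jl also offers a `truncated` function for the same purpose.

```julia
using Random

# Hypothetical helper: draw from Normal(mu, sigma) restricted to
# [lower, upper] by resampling until a draw lands inside the range.
function sample_truncated_normal(rng, mu, sigma, lower, upper)
    while true
        x = mu + sigma * randn(rng)
        lower <= x <= upper && return x
    end
end

rng = MersenneTwister(1)
draws = [sample_truncated_normal(rng, 0.0, 3.0, 2.0, 4.0) for _ in 1:1000]
```

Every value in `draws` lies in [2, 4], even though the untruncated Normal(0, 3) puts most of its mass outside that interval. Rejection sampling is simple but wasteful when the range covers little of the distribution's mass.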

> whether you could extend the current syntax to pass a distribution or a sampler where a sampler is anything that can be queried and could be user defined

So, instead of passing r or (r, d) I pass (:lambda, s), where s is any sampler? Sounds like a good idea!
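One way such a user-defined sampler could look, as a sketch only: the `LogUniform` type below is hypothetical and not part of MLJ or Distributions.jl. It hooks into Julia's standard `rand` interface via `Random.SamplerTrivial`, so that `rand(rng, s)` works on it like on any built-in sampler. Whether the tuning strategy would accept such an object in a pair like (:lambda, s) is exactly the open question in this thread.

```julia
using Random

# Hypothetical user-defined sampler: log-uniform on [lower, upper].
struct LogUniform
    lower::Float64
    upper::Float64
end

# Declare the type of value produced, then hook into the `rand` API.
Base.eltype(::Type{LogUniform}) = Float64
function Random.rand(rng::AbstractRNG, sp::Random.SamplerTrivial{LogUniform})
    s = sp[]  # unwrap the LogUniform instance
    # Sample uniformly on the log scale, then map back.
    return exp(log(s.lower) + (log(s.upper) - log(s.lower)) * rand(rng))
end

s = LogUniform(1e-4, 1e2)
lambda = rand(MersenneTwister(42), s)
```

With this in place, "anything that can be queried" reduces to "anything supporting `rand(rng, s)`", which keeps the strategy agnostic about where the sampler comes from.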

> in light of (1) whether you could pass the history ("context") to the sampler

Also sounds like a nice idea, but a non-trivial API complication. What would the interface for passing context to a sampler look like?

> at the moment, unless I misunderstand something, it seems you would sample equally in all dimensions (one configuration = one sample per each HP) this makes sense however I could see interest in sampling more along specific dimensions in which case you may want to pass a number of samples?

I'm not sure I understand the proposal. Are you suggesting that some hyper parameters be sampled less often (i.e., are kept fixed while others change)? Can you explain a situation where this might be beneficial? (I am assuming here that we are not leaving the realm of ordinary random sampling, which does not consider the history of previous evaluations.) What do you mean by "pass a number of samples"? Or do you mean samplers? Could you give me a little more detail?


Add tuning by stochastic search about mlj.jl HOT 6 CLOSED
