
ParameterHandling.jl's Introduction

ParameterHandling


ParameterHandling.jl is an experiment in handling constrained tunable parameters of models.

The Parameter Handling Problem

Consider the following common situation: you have a function build_model that maps a collection of parameters θ to a model of some kind:

model = build_model(θ)

The model might, for example, be a function that maps some input x to some sort of prediction y:

y = model(x)

where x and y could essentially be anything that you like. You might also wish to somehow "learn" or "tune" or "infer" the parameters θ by plugging build_model into some other function -- let's call it learn -- that tries out various parameter values in some clever way and determines which ones are good: think loss minimisation / objective maximisation, (approximate) Bayesian inference, etc. We'll not worry about exactly what procedure learn employs to try out a number of different parameter values, but suppose that learn has the interface:

learned_θ = learn(build_model, initial_θ)

So far so good, but now consider how one actually goes about writing build_model. There are more or less two requirements to satisfy:

  1. θ must be in a format that learn knows how to handle. A popular approach is to require that θ be a Vector of Real numbers -- or, rather, some concrete subtype of Real.
  2. The code required to turn θ into model inside build_model mustn't be too onerous to write, read, or modify.

While the first point is fairly straightforward, the second point is a bit subtle, so it's worth dwelling on it a little.

For the sake of concreteness, let's suppose that we adopt the convention that θ is a Vector{Float64}. In the case of linear regression, we might assume that θ comprises a length D "weight vector" w, and a scalar "bias" b. So the function to build the model might be something like

function build_model(θ::Vector{Float64})
    return x -> dot(θ[1:end-1], x) + θ[end]
end

The easiest way to see that this is a less-than-ideal solution is to consider what this function would look like if θ were, say, a NamedTuple with fields w and b:

function build_model(θ::NamedTuple)
    return x -> dot(θ.w, x) + θ.b
end

This version of the function is much easier to read -- moreover if you want to inspect the values of w and b at some other point in time, you don't need to know precisely how to chop up the vector.

It also seems probable that the latter approach is less bug-prone. Suppose that for some reason one refactored the code so that the first element of θ became b and the last D elements became w; any code that depended upon the original ordering would now be incorrect, and would likely fail silently. The NamedTuple approach simply doesn't have this issue.

Granted, in this simple case it's not too much of a problem, but it's easy to find situations in which things become considerably more difficult. For example, suppose that we instead had pretty much any kind of neural network, Gaussian process, ODE, or really just any model with more than a couple of distinct parameters. From the perspective of writing complicated models, implementing things in terms of a single vector of parameters that is manually chopped up is an extremely bad design choice. It simply doesn't scale.

However, a single vector of e.g. Float64s is extremely convenient when writing general purpose optimisers / approximate inference routines -- Optim.jl and AdvancedHMC.jl being two obvious examples.

The ParameterHandling.jl Approach

ParameterHandling.jl aims to give you the best of both worlds by providing the tools required to automate the transformation between a "structured" representation (e.g. a nested NamedTuple / Dict etc) and a "flattened" representation (e.g. a Vector{Float64}) of your model parameters.

The function flatten eats a structured representation of some parameters, returning the flattened representation and a function that converts the flattened thing back into its structured representation.

flatten is implemented recursively, with a very small number of base-implementations that don't themselves call flatten.
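For example, a quick sketch of the interface on a simple NamedTuple (the exact output shown here is illustrative):

julia> using ParameterHandling

julia> v, unflatten = flatten((a = 1.0, b = [2.0, 3.0]));

julia> v
3-element Vector{Float64}:
 1.0
 2.0
 3.0

julia> unflatten(v)
(a = 1.0, b = [2.0, 3.0])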

You should expect to occasionally have to extend flatten to handle your own types and, if you wind up doing this for a type from Base that this package doesn't yet cover, a PR including that implementation will be very welcome.
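As a rough guide, extending flatten means adding a method to the two-argument form flatten(::Type{T}, x) that returns the flat vector together with a matching unflatten closure. A minimal sketch for a hypothetical two-field struct:

struct MyParams
    a::Float64
    b::Float64
end

function ParameterHandling.flatten(::Type{T}, x::MyParams) where {T<:Real}
    v = T[x.a, x.b]  # the flat representation
    unflatten_to_MyParams(v_new::AbstractVector) = MyParams(v_new[1], v_new[2])
    return v, unflatten_to_MyParams
end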

See test/parameters.jl for a couple of examples that utilise flatten to do something similar to the task described above.

Dealing with Constrained Parameters

It is very common to need to handle constraints on parameters: for example, it may be necessary for a particular scalar to always be positive. While flatten is great for changing between representations of your parameters, it doesn't have anything to say about this constraint problem.

For this we introduce a collection of new AbstractParameter types (whether we really need them to have some mutual supertype is unclear at present) that play nicely with flatten and allow one to specify that e.g. a particular scalar must remain positive, or should be fixed across iterations. See src/parameters.jl and test/parameters.jl for more examples.

The approach to implementing these types typically revolves around some kind of deferred / delayed computation. For example, a Positive parameter is represented by an "unconstrained" number and a "transform" that maps from the entire real line to the positive half. The value of a Positive is given by the application of this transform to the unconstrained number. flattening a Positive yields a length-1 vector containing the unconstrained number, rather than the value represented by the Positive object. For example:

julia> using ParameterHandling

julia> x_constrained = 1.0 # Specify constrained value.
1.0

julia> x = positive(x_constrained) # Construct a number that should remain positive.
ParameterHandling.Positive{Float64, typeof(exp), Float64}(-1.490116130486996e-8, exp, 1.4901161193847656e-8)

julia> ParameterHandling.value(x) # Get the constrained value by applying the transform.
1.0

julia> v, unflatten = flatten(x); # Supports the `flatten` interface.

julia> v
1-element Vector{Float64}:
 -1.490116130486996e-8

julia> new_v = randn(1) # Pick a random new value.
1-element Vector{Float64}:
 2.3482666974328716

julia> ParameterHandling.value(unflatten(new_v)) # Obtain constrained value.
10.467410816707215

We also provide the utility function value_flatten, which returns an unflattening function equivalent to v -> value(unflatten(v)). The example above could then be written as

julia> v, unflatten = value_flatten(x);

julia> unflatten(v)
1.0

It is straightforward to implement your own parameters that interoperate with those already written by implementing value and flatten for them. You might want to do this if this package doesn't currently support the functionality that you need.
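As a sketch of what that involves, here is a hypothetical parameter type (purely illustrative, not part of the package) whose value is constrained to be non-negative by squaring an unconstrained number:

struct Squared{T<:Real}
    unconstrained_value::T
end

# The represented value is the square of the unconstrained number.
ParameterHandling.value(x::Squared) = abs2(x.unconstrained_value)

# Flatten the unconstrained number, and rebuild a Squared on the way back.
function ParameterHandling.flatten(::Type{T}, x::Squared) where {T<:Real}
    v, unflatten_to_Real = ParameterHandling.flatten(T, x.unconstrained_value)
    unflatten_Squared(v_new::AbstractVector) = Squared(unflatten_to_Real(v_new))
    return v, unflatten_Squared
end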

A Worked Example

We use a model involving a Gaussian process (GP) -- you don't need to know anything about Gaussian processes other than

  1. they are a class of probabilistic model which can be used for regression (amongst other things).
  2. they have some tunable parameters that are usually chosen by optimising a scalar objective function using an iterative optimisation algorithm -- typically a variant of gradient descent. This is representative of a large number of models in ML / statistics / optimisation.

This example can be copy+pasted into a REPL session.

# Install some packages.
using Pkg
Pkg.add("ParameterHandling")
Pkg.add("Optim")
Pkg.add("Zygote")
Pkg.add("AbstractGPs")

using ParameterHandling # load up this package.
using Optim # generic optimisation
using Zygote # algorithmic differentiation
using AbstractGPs # package containing the models we'll be working with

# Declare a NamedTuple containing an initial guess at parameters.
raw_initial_params = (
    k1 = (
        var=positive(0.9),
        precision=positive(1.0),
    ),
    k2 = (
        var=positive(0.1),
        precision=positive(0.3),
    ),
    noise_var=positive(0.2),
)

# Using ParameterHandling.value_flatten, we can obtain both a Vector{Float64} representation of
# these parameters, and a function mapping from that vector back to the original
# (constrained) parameter values.
flat_initial_params, unflatten = ParameterHandling.value_flatten(raw_initial_params)

# ParameterHandling.value strips out all of the Positive types in raw_initial_params,
# returning a plain NamedTuple of NamedTuples and Float64s.
initial_params = ParameterHandling.value(raw_initial_params)
# (k1 = (var = 0.9, precision = 1.0), k2 = (var = 0.10000000000000002, precision = 0.30000000000000004), noise_var = 0.19999999999999998)

# GP-specific functionality. Don't worry about the details, just
# note the use of the structured representation of the parameters.
function build_gp(params::NamedTuple)
    k1 = params.k1.var * Matern52Kernel() ∘ ScaleTransform(params.k1.precision)
    k2 = params.k2.var * SEKernel() ∘ ScaleTransform(params.k2.precision)
    return GP(k1 + k2)
end

# Generate some synthetic training data.
# Again, don't worry too much about the specifics here.
const x = range(-5.0, 5.0; length=100)
const y = rand(build_gp(initial_params)(x, initial_params.noise_var))

# Specify an objective function in terms of x and y.
function objective(params::NamedTuple)
    f = build_gp(params)
    return -logpdf(f(x, params.noise_var), y)
end

# Use Optim.jl to minimise the objective function w.r.t. the params.
# The important things to note here are that we pass the flat vector of parameters to
# Optim, which is something that Optim knows how to work with, and use `unflatten` to
# convert from that flat representation back to the structured one that our objective
# function knows about -- we've used ParameterHandling to build a bridge between Optim
# and an entirely unrelated package.
training_results = Optim.optimize(
    objective ∘ unflatten,
    θ -> only(Zygote.gradient(objective ∘ unflatten, θ)),
    flat_initial_params,
    BFGS(
        alphaguess = Optim.LineSearches.InitialStatic(scaled=true),
        linesearch = Optim.LineSearches.BackTracking(),
    ),
    Optim.Options(show_trace = true);
    inplace=false,
)

# Extracting the final values of the parameters.
final_params = unflatten(training_results.minimizer)
f_trained = build_gp(final_params)

Usually you would go on to make some predictions on test data using f_trained, or something like that. From the perspective of ParameterHandling.jl, though, we've seen the interesting stuff: an example of how ParameterHandling.jl can be used to bridge the gap between the "flat" representation of parameters that Optim likes to work with and the "structured" representation that is convenient for writing models.
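For completeness, a sketch of what those predictions might look like with AbstractGPs (this part is AbstractGPs' API, not ParameterHandling's):

# Condition the trained GP on the observations and query it at test inputs.
f_posterior = posterior(f_trained(x, final_params.noise_var), y)
x_test = range(-6.0, 6.0; length=50)
preds = marginals(f_posterior(x_test))  # per-point predictive marginals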

Gotchas and Performance Tips

  1. Integers typically don't take part in the kind of optimisation procedures that this package is designed to handle. Consequently, flatten(::Integer) produces an empty vector.
  2. deferred has some type-stability issues when used in conjunction with abstract types. For example, flatten(deferred(Normal, 5.0, 4.0)) won't infer properly. A simple work-around is to write a function normal(args...) = Normal(args...) and work with deferred(normal, 5.0, 4.0) instead -- see the sketch after this list.
  3. Let x be an Array{<:Real}. If you wish to constrain each of its values to be positive, prefer positive(x) over map(positive, x) or positive.(x). positive(x) has been implemented such that the associated unflatten function has good performance, particularly when interacting with Zygote (for which map(positive, x) is extremely slow). The same applies to bounded values: prefer bounded(x, lb, ub) to e.g. bounded.(x, lb, ub).
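Spelling out the work-around from item 2 (assuming Distributions.jl is loaded for Normal):

using Distributions, ParameterHandling

# A concretely-typed wrapper around the constructor.
normal(args...) = Normal(args...)

d = deferred(normal, 5.0, 4.0)  # infers properly, unlike deferred(Normal, 5.0, 4.0)
v, unflatten = flatten(d)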


ParameterHandling.jl's Issues

RFC: pretty-printing

ParameterHandling.jl is a nice framework for organising parameters into nested NamedTuples. Whilst working with them in JuliaGaussianProcesses/AbstractGPs.jl#240, what I found missing was a nice way of printing this parameter structure. It seems like the best place to sort this out is within ParameterHandling directly, as it needs to know about the various constraints to print them properly.

So far, I've got the following stub:

using Printf

function show_params(nt::NamedTuple, pre=0)
    res = ""
    for (s, v) in pairs(nt)
        if typeof(v) <: NamedTuple
            res *= join(fill(" ", pre)) * "$(s):\n" * show_params(v, pre+4)
        else
            res *= join(fill(" ", pre)) * "$s = $(@sprintf("%.3f", v))\n"
        end
    end
    return res
end

which then gets called as print(show_params(ParameterHandling.value(param_struct))).

Ideally, I would like to avoid the call to value to remove all the constraints, and instead add the corresponding information (e.g. scale = 1.345 (positive)).

And of course one would have to think a bit harder to make it work well for vectors/matrices etc., so I thought it better to first start a discussion about what might be reasonable :)

Update: Oh, and of course instead of directly printing, it should implement show so it works well both in REPL and notebooks.

Feature Request: `flatten_only`

One of the biggest problems I have when handling parameters (whether I'm using ComponentArrays, Functors, ParameterHandling, etc.) is that I often only want to optimize (or do some other analysis) over a certain subset of the parameters. And, critically, that subset is going to be constantly changing. It would be nice to have a way to tag parameters for certain operations.

One way to accomplish this with ParameterHandling would be a flatten_only function that digs through the structure and only pulls out elements wrapped in a specified type. The idea is that a user would create a wrapper type and method for value that unwraps the object. I'm not 100% sure this is the best way to handle it (maybe it would be better for ParameterHandling to have a Tagged type that allows users to tag with symbols?), but here is an idea of what it would look like to use:

struct Tunable{T}
    val::T
end

ParameterHandling.value(x::Tunable) = x.val

julia> raw_params = (
       a = 1,
       b = 2.5,
       c = Tunable(3.1),
       d = (
           a = Tunable(4),
           b = 5.1,
       ),
   );

julia> tunable_params, unflatten = flatten_only(Tunable, raw_params);

julia> tunable_params
2-element Vector{Float64}:
 3.1
 4.0

Increased AD compilation times induced by the change in #67

While the compiled function runs much faster due to the improved type stability, for bigger tuples this leads to a blowup of compilation time for Zygote when differentiating through the recursively-defined unflatten function (unsurprisingly, perhaps). We should maybe add an rrule.

AbstractArray -> StridedArray

#21 essentially points out that we have the same problem that we see regularly in the AD world, where a sensible implementation of flatten for e.g. a sparse array or Fill is not the same as a sensible implementation for e.g. a StridedArray.

Should be a quick fix. We probably need to copy over + tweak the code for to_vec for general structs so that flatten gives you the struct interpretation of an array by default.

Disclaimer: I'm assuming that if a user writes Fill(5.0, 10) then they want to constrain this vector to be a Fill throughout e.g. optimisation, and that if this weren't the case said user would write fill(5.0, 10).

Using flatten/unflatten for Automatic Differentiation

Hi there,

Really nice package!

I was wondering if one can adjust the flatten/unflatten functions such that unflatten also works inside a closure, for use with Automatic Differentiation.

At the moment, it seems that the type constraints cannot handle Duals. MWE:

using ParameterHandling, Distributions, ReverseDiff, ForwardDiff

# Get sample data and parameter
val = ( μ = 1., σ = 2. )
data = randn(100)

# Write down a logdensity with parameter and data as arguments
function log_density(val, data)
        return sum( Distributions.logpdf(Distributions.Normal(val.μ, val.σ), data[iter] ) for iter in eachindex(data) )
end
log_density(val, data)

# Closure for transforming θ Vector to NamedTuple
function get_log_target(val, data)
        _, unflatten = ParameterHandling.flatten(val)
        function log_target(θ::AbstractVector{T}) where T
                return log_density( unflatten(θ), data)
        end
end

# Check if it is working
lt = get_log_target(val, data)
theta = [.1, .2]
lt(theta)

# Method Error for Dual numbers
ForwardDiff.gradient(lt, theta) # MethodError: no method matching (::ParameterHandling.var"#unflatten_to_NamedTuple#15...
ReverseDiff.gradient(lt, theta) # MethodError: no method matching (::ParameterHandling.var"#unflatten_to_NamedTuple#15"{Float64, NamedTuple{(:μ, :σ)

Zygote was actually working for this toy example, but breaks when used with more complex examples (and was not very fast).

RFC: making parameters subtypes of their originals

The AbstractParameters supertype does not seem to be used and there has not been a need for it in my experience.
What if we made the parameters subtypes of their originals, e.g. what if positive(1.0) isa Real or positive_definite(A) isa AbstractMatrix{eltype(A)}? Overloading arithmetic methods would let us stick those parameters almost anywhere and have them unpacked automatically.

Unitful parameters

My parameter values are unitful and I want to use positive(x) where x is a unitful Quantity from Unitful.jl. Currently I get an error

julia> using Unitful, ParameterHandling

julia> positive(1u"m")
ERROR: MethodError: no method matching positive(::Quantity{Int64, 𝐋, Unitful.FreeUnits{(m,), 𝐋, nothing}})

What would be a good way to proceed? Is there a recommended way I should adjust my code, or a way for ParameterHandling to handle this directly? Thanks.

let `positive_definite` return a PDMat

This would allow for avoiding a Cholesky each time e.g. when parametrising the covariance of an MvNormal. Would you be willing to accept the PDMats.jl dependency?

Support `flatten(eltype, x)`?

The fact that flatten(::Integer) -> Vector{Float64} is kind of annoying; maybe we could propagate an eltype to make it clearer what the returned array should be? An alternative might be to recursively call promote_type as we go, but that seems like more work.

default behavior for flatten/unflatten with any type

Hi,
I'm using ParameterHandling only for the flatten/unflatten tools. In my case I need to carry extra information within my tuple, stored in a different type, that does not need to be flattened but which I need to recover in my restructured NamedTuple.
Here is an example

struct MyType
       a::Float64
       b::Int
end

mynametuple = (;a=rand(2),b=randn(6),m = MyType(1,2))

If I try to flatten this tuple I get an error:

flatvector, restructure = ParameterHandling.flatten( mynametuple)
ERROR: MethodError: no method matching flatten(::Type{Float64}, ::MyType)

Closest candidates are:
  flatten(::Type{T}, ::Tuple{}) where T<:Real
   @ ParameterHandling ~/.julia/packages/ParameterHandling/WkHof/src/flatten.jl:92
  flatten(::Type{T}, ::Nothing) where T<:Real
   @ ParameterHandling ~/.julia/packages/ParameterHandling/WkHof/src/flatten.jl:22
  flatten(::Type{T}, ::Integer) where T<:Real
   @ ParameterHandling ~/.julia/packages/ParameterHandling/WkHof/src/flatten.jl:28
  ...

Stacktrace:
 [1] flatten(::Type{Float64}, x::Tuple{MyType})
   @ ParameterHandling ~/.julia/packages/ParameterHandling/WkHof/src/flatten.jl:82
 [2] flatten(::Type{Float64}, x::Tuple{Vector{Float64}, MyType})
   @ ParameterHandling ~/.julia/packages/ParameterHandling/WkHof/src/flatten.jl:83
 [3] flatten(::Type{Float64}, x::Tuple{Vector{Float64}, Vector{Float64}, MyType})
   @ ParameterHandling ~/.julia/packages/ParameterHandling/WkHof/src/flatten.jl:83
 [4] flatten(::Type{Float64}, x::@NamedTuple{a::Vector{Float64}, b::Vector{Float64}, m::MyType})
   @ ParameterHandling ~/.julia/packages/ParameterHandling/WkHof/src/flatten.jl:99
 [5] flatten(x::@NamedTuple{a::Vector{Float64}, b::Vector{Float64}, m::MyType})
   @ ParameterHandling ~/.julia/packages/ParameterHandling/WkHof/src/flatten.jl:20
 [6] top-level scope
   @ REPL[14]:1

However, with this simple overloading of ParameterHandling.flatten as a default behaviour

function ParameterHandling.flatten(::Type{T}, x) where {T<:Real}
    v = T[]
    unflatten_to_Any(::Vector{T}) = x
    return v, unflatten_to_Any
end

I get:

julia> flatvector, restructure = ParameterHandling.flatten( mynametuple)
([0.8409925019315632, 0.2819056254106582, 1.598581283953101, 0.3921407396316173, -1.6414335944803775, 1.0503960727007877, -1.183085803897985, -2.3827810941365573], ParameterHandling.var"#unflatten_to_NamedTuple#15"{Float64, @NamedTuple{a::Vector{Float64}, b::Vector{Float64}, m::MyType}, ParameterHandling.var"#unflatten_to_Tuple#13"{Float64, Int64, Int64, ParameterHandling.var"#unflatten_to_Tuple#13"{Float64, Int64, Int64, ParameterHandling.var"#unflatten_to_Tuple#13"{Float64, Int64, Int64, ParameterHandling.var"#unflatten_to_empty_Tuple#14"{Float64, Tuple{}}, var"#unflatten_to_Any#6"{Float64, MyType}}, ParameterHandling.var"#unflatten_to_Vector#4"{Float64, Float64}}, ParameterHandling.var"#unflatten_to_Vector#4"{Float64, Float64}}}((a = [0.8409925019315632, 0.2819056254106582], b = [1.598581283953101, 0.3921407396316173, -1.6414335944803775, 1.0503960727007877, -1.183085803897985, -2.3827810941365573], m = MyType(1.0, 2)), ParameterHandling.var"#unflatten_to_Tuple#13"{Float64, Int64, Int64, ParameterHandling.var"#unflatten_to_Tuple#13"{Float64, Int64, Int64, ParameterHandling.var"#unflatten_to_Tuple#13"{Float64, Int64, Int64, ParameterHandling.var"#unflatten_to_empty_Tuple#14"{Float64, Tuple{}}, var"#unflatten_to_Any#6"{Float64, MyType}}, ParameterHandling.var"#unflatten_to_Vector#4"{Float64, Float64}}, ParameterHandling.var"#unflatten_to_Vector#4"{Float64, Float64}}(6, 2, ParameterHandling.var"#unflatten_to_Tuple#13"{Float64, Int64, Int64, ParameterHandling.var"#unflatten_to_Tuple#13"{Float64, Int64, Int64, ParameterHandling.var"#unflatten_to_empty_Tuple#14"{Float64, Tuple{}}, var"#unflatten_to_Any#6"{Float64, MyType}}, ParameterHandling.var"#unflatten_to_Vector#4"{Float64, Float64}}(0, 6, ParameterHandling.var"#unflatten_to_Tuple#13"{Float64, Int64, Int64, ParameterHandling.var"#unflatten_to_empty_Tuple#14"{Float64, Tuple{}}, var"#unflatten_to_Any#6"{Float64, MyType}}(0, 0, ParameterHandling.var"#unflatten_to_empty_Tuple#14"{Float64, Tuple{}}(()), var"#unflatten_to_Any#6"{Float64, MyType}(MyType(1.0, 2))), ParameterHandling.var"#unflatten_to_Vector#4"{Float64, Float64}()), ParameterHandling.var"#unflatten_to_Vector#4"{Float64, Float64}())))

julia> restructure(flatvector)
(a = [0.8409925019315632, 0.2819056254106582], b = [1.598581283953101, 0.3921407396316173, -1.6414335944803775, 1.0503960727007877, -1.183085803897985, -2.3827810941365573], m = MyType(1.0, 2))

Is there any reason for not implementing such default behavior? Am I missing something?
Should I make a PR?

Different behavior for Integers and Vectors of Integers when flattened

Hi there,

I noticed a different behavior for Integers and Vectors of Integers when flattened - Integers are not flattened, but Vectors of Integers are flattened and transformed to Float64s. MWE:

using ParameterHandling
val = ( a = 1., b = 2, c = [3., 4.], d = [5, 6] )
ParameterHandling.flatten(val) # Vector{Float64} with 5 elements -> [1., 3., 4., 5., 6.]

I believe this line might be what causes Vectors of Integers to be flattened as well:

flatten(::Type{T}, x::Vector{R}) where {T<:Real, R<:Real} = (Vector{T}(x), Vector{R})

Multistart optimization

Hello,

I have a multistart optimization code that generates initial parameter guesses initialps, as you can see below, and later uses every row of them to solve a new optimization problem.

using Distributions, LatinHypercubeSampling, Statistics
p = convert(Array{Float64,1}, 1:3)
bounds = [Vector{Float64}(undef,2) for _ in 1:length(p)]
searchgrid = [1E-9, 1E-1]
for (ipara,para) in enumerate(p)
  bounds[ipara][1] = para * searchgrid[1]
  bounds[ipara][2] = para * searchgrid[2]
end

function latinCube(bounds, dims, nguess = 100)  
    initialps = []
    plan, _ = LHCoptim(nguess, dims, 1000)
    plan /= nguess
    
    for i in 1:dims
        append!(initialps, [quantile(LogUniform(bounds[i][1], bounds[i][2]), plan[:,i])])
    end
    return permutedims(hcat(initialps...))
end

nguess = 100
initialps = latinCube(bounds, length(p), nguess)

My problem is that I do not know how to implement this when p is a named tuple of the form

p = (p₁ = 1., p₂ = fixed(2.), p₃ = bounded(3., 0, 100))

So my question is how I can do multistart optimization using ParameterHandling.

`positive(val::T) -> T`?

Calling positive(one(Float32)) should return a Positive{Float32}, but it doesn't, because ε is always a Float64. Maybe we should restrict the method so that val and ε have the same type, with ε = eps(T) by default?
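A sketch of what that proposal might look like (the positional signature of positive, with an explicit transform and ε, is an assumption here):

# Hypothetical default: tie ε's type to the type of val.
positive(val::T) where {T<:Real} = positive(val, exp, eps(T))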

Comparison to TransformVariables and Bijectors?

Hi, from the README this looks very similar to TransformVariables, and to some special cases of Bijectors. How is this package different from those? It would be helpful to have some details on this in the README, to help users know which of the three packages is the best fit for their use case.

Zygote update breaking CI

Zygote 0.6.5 was released yesterday, which I'm fairly certain is behind the failing CI:

https://github.com/invenia/ParameterHandling.jl/runs/2189676220?check_suite_focus=true

CI passed the day before using 0.6.4

https://github.com/invenia/ParameterHandling.jl/runs/2180773450?check_suite_focus=true

More generally, there are no compat bounds in the test Project.toml

Zygote: Error During Test at /home/runner/work/ParameterHandling.jl/ParameterHandling.jl/test/parameters.jl:71
  Test threw exception
  Expression: only(pb(randn(3, 2))) isa Matrix{<:Real}
  ArgumentError: NamedTuple{(:U,),Tuple{Array{Float64,2}}} keys must be a subset of NamedTuple{(:V,),Tuple{Adjoint{Float64,Array{Float64,2}}}} keys
  Stacktrace:
   [1] #s77#84 at /home/runner/.julia/packages/Zygote/pM10l/src/lib/lib.jl:20 [inlined]
   [2] #s77#84(::Any, ::Any, ::Any) at ./none:0
   [3] (::Core.GeneratedFunctionStub)(::Any, ::Vararg{Any,N} where N) at ./boot.jl:543
   [4] accum(::NamedTuple{(:V,),Tuple{Adjoint{Float64,Array{Float64,2}}}}, ::Nothing, ::NamedTuple{(:U,),Tuple{Array{Float64,2}}}) at /home/runner/.julia/packages/Zygote/pM10l/src/lib/lib.jl:13
   [5] getindex at ./tuple.jl:24 [inlined]
   [6] (::typeof(∂(value)))(::Array{Float64,2}) at /home/runner/.julia/packages/Zygote/pM10l/src/compiler/interface2.jl:0
   [7] #3 at /home/runner/work/ParameterHandling.jl/ParameterHandling.jl/test/parameters.jl:70 [inlined]
   [8] (::typeof(∂(#3)))(::Array{Float64,2}) at /home/runner/.julia/packages/Zygote/pM10l/src/compiler/interface2.jl:0
   [9] (::Zygote.var"#41#42"{typeof(∂(#3))})(::Array{Float64,2}) at /home/runner/.julia/packages/Zygote/pM10l/src/compiler/interface.jl:41
   [10] top-level scope at /home/runner/work/ParameterHandling.jl/ParameterHandling.jl/test/parameters.jl:71
   [11] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1119
   [12] top-level scope at /home/runner/work/ParameterHandling.jl/ParameterHandling.jl/test/parameters.jl:70
   [13] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1119
   [14] top-level scope at /home/runner/work/ParameterHandling.jl/ParameterHandling.jl/test/parameters.jl:53
   [15] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1119
   [16] top-level scope at /home/runner/work/ParameterHandling.jl/ParameterHandling.jl/test/parameters.jl:8

RFC: unfixing

It'd be convenient to be able to unfix a parameter. I think it should satisfy the following:

unfix(p) = p
unfix(p::Fixed) = p.value

plus recursively applying unfix on nested structures. I'm not quite sure what exactly that should look like -- perhaps mapping over NamedTuple/Tuple/Dict, as in the sketch below?
What do you think?
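One possible recursive extension, as a sketch:

unfix(p) = p
unfix(p::Fixed) = p.value
unfix(p::NamedTuple) = map(unfix, p)
unfix(p::Tuple) = map(unfix, p)
unfix(p::AbstractDict) = Dict(k => unfix(v) for (k, v) in pairs(p))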

Incorrect type promotions with flatten

Looks like if you try to flatten a deferred function that takes an integer (probably other cases too), the returned parameters are incorrectly promoted. As a result, calling unflatten(params) throws an error. This seems like a more extreme case of the gotchas listed in the README, hence the issue.

Example:

julia> params, unflatten = flatten((a=rand(Float32, 2), b=deferred(fill, one(Float32), 4)))
([0.46068453788757324, 0.18974554538726807, 1.0], ParameterHandling.var"#unflatten_to_NamedTuple#10"{NamedTuple{(:a, :b),Tuple{Array{Float32,1},ParameterHandling.Deferred{typeof(fill),Tuple{Float32,Int64}}}},ParameterHandling.var"#unflatten_to_Tuple#8"{Tuple{typeof(identity),ParameterHandling.var"#unflatten_Deferred#19"{ParameterHandling.Deferred{typeof(fill),Tuple{Float32,Int64}},ParameterHandling.var"#unflatten_to_Tuple#8"{Tuple{ParameterHandling.var"#unflatten_to_Real#2",ParameterHandling.var"#unflatten_Integer#1"{Int64}},Tuple{Int64,Int64},Tuple{Int64,Int64}}}},Tuple{Int64,Int64},Tuple{Int64,Int64}}}((a = Float32[0.46068454, 0.18974555], b = ParameterHandling.Deferred{typeof(fill),Tuple{Float32,Int64}}(fill, (1.0f0, 4))), ParameterHandling.var"#unflatten_to_Tuple#8"{Tuple{typeof(identity),ParameterHandling.var"#unflatten_Deferred#19"{ParameterHandling.Deferred{typeof(fill),Tuple{Float32,Int64}},ParameterHandling.var"#unflatten_to_Tuple#8"{Tuple{ParameterHandling.var"#unflatten_to_Real#2",ParameterHandling.var"#unflatten_Integer#1"{Int64}},Tuple{Int64,Int64},Tuple{Int64,Int64}}}},Tuple{Int64,Int64},Tuple{Int64,Int64}}((identity, ParameterHandling.var"#unflatten_Deferred#19"{ParameterHandling.Deferred{typeof(fill),Tuple{Float32,Int64}},ParameterHandling.var"#unflatten_to_Tuple#8"{Tuple{ParameterHandling.var"#unflatten_to_Real#2",ParameterHandling.var"#unflatten_Integer#1"{Int64}},Tuple{Int64,Int64},Tuple{Int64,Int64}}}(ParameterHandling.Deferred{typeof(fill),Tuple{Float32,Int64}}(fill, (1.0f0, 4)), ParameterHandling.var"#unflatten_to_Tuple#8"{Tuple{ParameterHandling.var"#unflatten_to_Real#2",ParameterHandling.var"#unflatten_Integer#1"{Int64}},Tuple{Int64,Int64},Tuple{Int64,Int64}}((ParameterHandling.var"#unflatten_to_Real#2"(), ParameterHandling.var"#unflatten_Integer#1"{Int64}(4)), (1, 0), (1, 1)))), (2, 1), (2, 3))))

julia> typeof(params)
Array{Float64,1}

julia> unflatten(params)
ERROR: MethodError: Cannot `convert` an object of type
  ParameterHandling.Deferred{typeof(fill){},Tuple{Float64,Int64{}}} to an object of type
  ParameterHandling.Deferred{typeof(fill){},Tuple{Float32,Int64{}}}
Closest candidates are:
  convert(::Type{T}, ::T) where T at essentials.jl:171
  ParameterHandling.Deferred{typeof(fill),Tuple{Float32,Int64}}(::Any, ::Any) where {Tf, Targs} at /Users/rory/.julia/packages/ParameterHandling/W3SDV/src/parameters.jl:140
Stacktrace:
 [1] convert(::Type{Tuple{ParameterHandling.Deferred{typeof(fill),Tuple{Float32,Int64}}}}, ::Tuple{ParameterHandling.Deferred{typeof(fill),Tuple{Float64,Int64}}}) at ./essentials.jl:310 (repeats 2 times)
 [2] Tuple{Array{Float32,1},ParameterHandling.Deferred{typeof(fill),Tuple{Float32,Int64}}}(::Tuple{Array{Float64,1},ParameterHandling.Deferred{typeof(fill),Tuple{Float64,Int64}}}) at ./tuple.jl:225
 [3] NamedTuple{(:a, :b),Tuple{Array{Float32,1},ParameterHandling.Deferred{typeof(fill),Tuple{Float32,Int64}}}}(::Tuple{Array{Float64,1},ParameterHandling.Deferred{typeof(fill),Tuple{Float64,Int64}}}) at ./namedtuple.jl:90
 [4] (::ParameterHandling.var"#unflatten_to_NamedTuple#10"{NamedTuple{(:a, :b),Tuple{Array{Float32,1},ParameterHandling.Deferred{typeof(fill),Tuple{Float32,Int64}}}},ParameterHandling.var"#unflatten_to_Tuple#8"{Tuple{typeof(identity),ParameterHandling.var"#unflatten_Deferred#19"{ParameterHandling.Deferred{typeof(fill),Tuple{Float32,Int64}},ParameterHandling.var"#unflatten_to_Tuple#8"{Tuple{ParameterHandling.var"#unflatten_to_Real#2",ParameterHandling.var"#unflatten_Integer#1"{Int64}},Tuple{Int64,Int64},Tuple{Int64,Int64}}}},Tuple{Int64,Int64},Tuple{Int64,Int64}}})(::Array{Float64,1}) at /Users/rory/.julia/packages/ParameterHandling/W3SDV/src/flatten.jl:73
 [5] top-level scope at REPL[41]:1

Problems with Enzyme

I'm not sure whether this is a bug in ParameterHandling or Enzyme:

using Enzyme
using ParameterHandling

f(θ) = θ.a + θ.b

θ1 = (a = fixed(1.), b = positive(1.));
x1, unpack1 = ParameterHandling.value_flatten(θ1);
bx1 = zeros(length(x1));
autodiff(Reverse, (f, x, unpack) -> f(unpack(x)), f, Duplicated(x1, bx1), unpack1);

θ2 = (a = fixed(1.), b = positive(1.), c = fixed(1.));
x2, unpack2 = ParameterHandling.value_flatten(θ2);
bx2 = zeros(length(x2));
autodiff(Reverse, (f, x, unpack) -> f(unpack(x)), f, Duplicated(x2, bx2), unpack2);

f(unpack1(x1)) ≈ f(unpack2(x2)) # true
bx1 ≈ bx2 # false

EDIT: important note: it occurs on Julia 1.9 but not on 1.8.

Extracting flatten/unflatten into lightweight package?

In the spirit of creating lightweight interface-defining packages (see TuringLang/Bijectors.jl#199 which resulted in InverseFunctions.jl and ChangesOfVariables.jl):

While adding an interface-test utility to ChangesOfVariables.jl today, I needed a helper function _to_realvec_and_back - which does exactly what ParameterHandling.flatten does. And I needed it for exactly what #27 is requesting. :-)

It would be nice not to reinvent this, but ParameterHandling would of course be way too heavy a dependency for ChangesOfVariables -- and even if it weren't, we'd end up with a circular dependency, since Bijectors will use ChangesOfVariables soon, so ParameterHandling will too.

I think the flatten/unflatten functionality of ParameterHandling does something very fundamental, and is orthogonal to its variable-transformation capabilities. Would you consider splitting it out into a lightweight package (basically the contents of flatten.jl)?

A truly lightweight recursive flatten/unflatten interface package could IMHO find use in many places in the ecosystem. flatten may be a bit too generic a name for the function (we probably have several flattens in the ecosystem), but how about flatten_and_back or so?

CC @willtebbutt, @paschermayr, @devmotion

Update: ChangesOfVariables.test_with_logabsdet_jacobian now has an additional optional argument to pass a transformation, which solves the dependency problem. Users will be able to use ParameterHandling for variable transformations during the test without a direct dependency between the two packages (after #27 is solved).

Issue with AD when using orthogonal

I'm encountering issues in using Zygote to AD through orthogonal. I note that in the present test we do test that AD works, but, as I understand it, we do not test for correctness.

This is an MWE that demonstrates the problem with FiniteDifferences. I include a variant that is identical in the forward pass and appears to be correct in the backward pass. I'm not too familiar with the AD ecosystem, but it is possible this should be an issue in ChainRules, because that is where the rrules for SVD are defined. Perhaps the rules there are not quite catching the algebra in nearest_orthogonal_matrix correctly?

using Zygote
using ParameterHandling
using ParameterHandling: value
using LinearAlgebra
using FiniteDifferences
using Test

function test_ad(test_function, Δoutput, inputs...; atol=1e-7, rtol=1e-7)

    # Verify that the forwards-pass produces the correct answer.
    output, pb = Zygote.pullback(test_function, inputs...)
    @test output ≈ test_function(inputs...)

    # Compute the adjoints using AD and FiniteDifferences.
    dW_ad = pb(Δoutput)
    dW_fd = FiniteDifferences.j′vp(central_fdm(5, 1), test_function, Δoutput, inputs...)

    # Compare AD and FiniteDifferences results.
    @testset "$(typeof(test_function)) argument $n" for n in eachindex(inputs)
        @test dW_ad[n] ≈ dW_fd[n] atol=atol rtol=rtol
    end
end

function nearest_orthogonal_matrix_variant(X::StridedMatrix{<:Union{Real, Complex}})
    svd(X).U * svd(X).V'
end

# Confirm forward pass is the same
r = rand(3,2)
@test ParameterHandling.nearest_orthogonal_matrix(r) ≈
    nearest_orthogonal_matrix_variant(r)

# Fails at Expression: ≈(dW_ad[n], dW_fd[n], atol = atol, rtol = rtol)
test_ad(
    ParameterHandling.nearest_orthogonal_matrix,
    randn(3,2),
    randn(3,2),
)

# Passes
test_ad(
    nearest_orthogonal_matrix_variant,
    randn(3,2),
    randn(3,2),
)
  [26cc04aa] FiniteDifferences v0.12.2
  [2412ca09] ParameterHandling v0.3.1
  [e88e6eb3] Zygote v0.6.4
  [37e2e46d] LinearAlgebra

Should `unflatten` unwrap `Fixed` values?

I am finally getting around to #30. In doing so I noticed:

julia> x = (1, 2.0, fixed(3.0))
(1, 2.0, ParameterHandling.Fixed{Float64}(3.0))

julia> v, unflatten = flatten(x);

julia> unflatten(v)
(1, 2.0, ParameterHandling.Fixed{Float64}(3.0))

Since wrapping something in Fixed is only done for the sake of the flatten/unflatten machinery, I wonder if it would be better for unflatten to return the unwrapped value of the Fixed number, since that's all end users will generally care about. Or I guess another question to ask is: Are there any situations where the end user would want the Fixed-wrapped value rather than just the value itself?

A vector, where all elements are bounded between 0 and 1, and sum(vector)=1.

Love the package.

I often have components of models where I need to parameterize a categorical probability distribution. I use the following trick:

σ(z::Real) = one(z) / (one(z) + exp(-z))
function unconstrained2probvec(v)
    l = length(v)
    remaining = 1.0
    o = zeros(l+1)
    for i in 1:l
        # log(l+1-i) is a "balancing" shift to ensure that a vector of zeros gives you [1/N ... 1/N]
        o[i] = remaining*σ(v[i] - log(l+1-i))
        remaining -= o[i]
    end
    o[end] = max(0.0,remaining)
    return o
end

This takes a vector of N-1 unbounded values and returns a vector of N values, each bounded between 0 and 1, that sums to 1. When the input is all zeros, the output is all 1/N.

I'm too stupid to figure out how to build this into your package, but if you think it would be sensible to do so, please consider this a feature request.
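For round-tripping through flatten/unflatten, the package would also need the inverse map, from a probability vector back to the N-1 unconstrained values; a sketch that inverts the construction above:

logit(p::Real) = log(p / (one(p) - p))

function probvec2unconstrained(o)
    l = length(o) - 1
    v = zeros(l)
    remaining = 1.0
    for i in 1:l
        # Invert o[i] = remaining * σ(v[i] - log(l + 1 - i)).
        v[i] = logit(o[i] / remaining) + log(l + 1 - i)
        remaining -= o[i]
    end
    return v
end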

[Feature request] `bounded` with equality (ie `[low, high]` rather than `(low, high)`)

It would be great to add an option to bounded so that it supports interval constraints with equality.

Current behaviour:

julia> bounded(0., 0., 1.)
ERROR: ArgumentError: Value, 0.0, outside of specified bounds (0.0, 1.0).
Stacktrace:
 [1] bounded(val::Float64, lower_bound::Float64, upper_bound::Float64)
   @ ParameterHandling ~/.julia/packages/ParameterHandling/QaqXk/src/parameters_scalar.jl:54

Example code in README fails (on Zygote.gradient)

Hi all,

The autodiff part of the example code in the README fails. If the Zygote line is removed, then it runs fine, but with it, you get the following stack trace:

julia> training_results = Optim.optimize(
           objective ∘ unflatten,
           θ -> only(Zygote.gradient(objective ∘ unflatten, θ)),
           flat_initial_params,
           BFGS(
               alphaguess = Optim.LineSearches.InitialStatic(scaled=true),
               linesearch = Optim.LineSearches.BackTracking(),
           ),
           Optim.Options(show_trace = true);
           inplace=false,
       )

ERROR: MethodError: no method matching AbstractGPs.FiniteGP(::GP{AbstractGPs.ZeroMean{Float64}, KernelSum{Tuple{TransformedKernel{ScaledKernel{Matern52Kernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}, TransformedKernel{ScaledKernel{SqExponentialKernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}}}}, ::Float64)

Closest candidates are:
  AbstractGPs.FiniteGP(::AbstractGPs.AbstractGP, ::AbstractVector)
   @ AbstractGPs ~/.julia/packages/AbstractGPs/XejGR/src/finite_gp_projection.jl:19
  AbstractGPs.FiniteGP(::AbstractGPs.AbstractGP, ::AbstractVector, ::Real)
   @ AbstractGPs ~/.julia/packages/AbstractGPs/XejGR/src/finite_gp_projection.jl:19
  AbstractGPs.FiniteGP(::AbstractGPs.AbstractGP, ::AbstractVector, ::AbstractVector{<:Real})
   @ AbstractGPs ~/.julia/packages/AbstractGPs/XejGR/src/finite_gp_projection.jl:13
  ...

Stacktrace:
  [1] macro expansion
    @ ~/.julia/packages/Zygote/YYT6v/src/compiler/interface2.jl:101 [inlined]
  [2] _pullback(::Zygote.Context{false}, ::Type{AbstractGPs.FiniteGP}, ::GP{AbstractGPs.ZeroMean{Float64}, KernelSum{Tuple{TransformedKernel{ScaledKernel{Matern52Kernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}, TransformedKernel{ScaledKernel{SqExponentialKernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}}}}, ::Float64)
    @ Zygote ~/.julia/packages/Zygote/YYT6v/src/compiler/interface2.jl:101
  [3] _apply(::Function, ::Vararg{Any})
    @ Core ./boot.jl:838
  [4] adjoint
    @ ~/.julia/packages/Zygote/YYT6v/src/lib/lib.jl:203 [inlined]
  [5] _pullback
    @ ~/.julia/packages/ZygoteRules/OgCVT/src/adjoint.jl:66 [inlined]
  [6] _pullback
    @ ~/.julia/packages/AbstractGPs/XejGR/src/finite_gp_projection.jl:32 [inlined]
  [7] _pullback(ctx::Zygote.Context{false}, f::GP{AbstractGPs.ZeroMean{Float64}, KernelSum{Tuple{TransformedKernel{ScaledKernel{Matern52Kernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}, TransformedKernel{ScaledKernel{SqExponentialKernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}}}}, args::Float64)
    @ Zygote ~/.julia/packages/Zygote/YYT6v/src/compiler/interface2.jl:0
  [8] #rrule_via_ad#54
    @ ~/.julia/packages/Zygote/YYT6v/src/compiler/chainrules.jl:260 [inlined]
  [9] rrule_via_ad
    @ ~/.julia/packages/Zygote/YYT6v/src/compiler/chainrules.jl:248 [inlined]
 [10] #1659
    @ ./none:0 [inlined]
 [11] iterate
    @ ./generator.jl:47 [inlined]
 [12] collect(itr::Base.Generator{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, ChainRules.var"#1659#1664"{Zygote.ZygoteRuleConfig{Zygote.Context{false}}, GP{AbstractGPs.ZeroMean{Float64}, KernelSum{Tuple{TransformedKernel{ScaledKernel{Matern52Kernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}, TransformedKernel{ScaledKernel{SqExponentialKernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}}}}}})
    @ Base ./array.jl:782
 [13] rrule(config::Zygote.ZygoteRuleConfig{Zygote.Context{false}}, ::typeof(sum), f::GP{AbstractGPs.ZeroMean{Float64}, KernelSum{Tuple{TransformedKernel{ScaledKernel{Matern52Kernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}, TransformedKernel{ScaledKernel{SqExponentialKernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}}}}, xs::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}; dims::Function)
    @ ChainRules ~/.julia/packages/ChainRules/snrkz/src/rulesets/Base/mapreduce.jl:102
 [14] rrule
    @ ~/.julia/packages/ChainRules/snrkz/src/rulesets/Base/mapreduce.jl:76 [inlined]
 [15] rrule(config::Zygote.ZygoteRuleConfig{Zygote.Context{false}}, ::typeof(mean), f::GP{AbstractGPs.ZeroMean{Float64}, KernelSum{Tuple{TransformedKernel{ScaledKernel{Matern52Kernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}, TransformedKernel{ScaledKernel{SqExponentialKernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}}}}, x::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}; dims::Function)
    @ ChainRules ~/.julia/packages/ChainRules/snrkz/src/rulesets/Statistics/statistics.jl:28
 [16] rrule
    @ ~/.julia/packages/ChainRules/snrkz/src/rulesets/Statistics/statistics.jl:21 [inlined]
 [17] chain_rrule
    @ ~/.julia/packages/Zygote/YYT6v/src/compiler/chainrules.jl:223 [inlined]
 [18] macro expansion
    @ ~/.julia/packages/Zygote/YYT6v/src/compiler/interface2.jl:101 [inlined]
 [19] _pullback(::Zygote.Context{false}, ::typeof(mean), ::GP{AbstractGPs.ZeroMean{Float64}, KernelSum{Tuple{TransformedKernel{ScaledKernel{Matern52Kernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}, TransformedKernel{ScaledKernel{SqExponentialKernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}}}}, ::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64})
    @ Zygote ~/.julia/packages/Zygote/YYT6v/src/compiler/interface2.jl:101
 [20] _pullback
    @ ~/.julia/packages/AbstractGPs/XejGR/src/abstract_gp.jl:48 [inlined]
 [21] _pullback(::Zygote.Context{false}, ::typeof(mean_and_cov), ::GP{AbstractGPs.ZeroMean{Float64}, KernelSum{Tuple{TransformedKernel{ScaledKernel{Matern52Kernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}, TransformedKernel{ScaledKernel{SqExponentialKernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}}}}, ::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64})
    @ Zygote ~/.julia/packages/Zygote/YYT6v/src/compiler/interface2.jl:0
 [22] _pullback
    @ ~/.julia/packages/AbstractGPs/XejGR/src/finite_gp_projection.jl:134 [inlined]
 [23] _pullback(ctx::Zygote.Context{false}, f::typeof(mean_and_cov), args::AbstractGPs.FiniteGP{GP{AbstractGPs.ZeroMean{Float64}, KernelSum{Tuple{TransformedKernel{ScaledKernel{Matern52Kernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}, TransformedKernel{ScaledKernel{SqExponentialKernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}}}}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, LinearAlgebra.Diagonal{Float64, FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}}}})
    @ Zygote ~/.julia/packages/Zygote/YYT6v/src/compiler/interface2.jl:0
 [24] _pullback
    @ ~/.julia/packages/AbstractGPs/XejGR/src/finite_gp_projection.jl:307 [inlined]
 [25] _pullback(::Zygote.Context{false}, ::typeof(logpdf), ::AbstractGPs.FiniteGP{GP{AbstractGPs.ZeroMean{Float64}, KernelSum{Tuple{TransformedKernel{ScaledKernel{Matern52Kernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}, TransformedKernel{ScaledKernel{SqExponentialKernel{Distances.Euclidean}, Float64}, ScaleTransform{Float64}}}}}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, LinearAlgebra.Diagonal{Float64, FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}}}}, ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/YYT6v/src/compiler/interface2.jl:0
 [26] _pullback
    @ ./REPL[29]:4 [inlined]
 [27] _pullback(ctx::Zygote.Context{false}, f::typeof(objective), args::NamedTuple{(:k1, :k2, :noise_var), Tuple{NamedTuple{(:var, :precision), Tuple{Float64, Float64}}, NamedTuple{(:var, :precision), Tuple{Float64, Float64}}, Float64}})
    @ Zygote ~/.julia/packages/Zygote/YYT6v/src/compiler/interface2.jl:0
 [28] _pullback
    @ ./operators.jl:1034 [inlined]
 [29] _pullback
    @ ./operators.jl:1031 [inlined]
 [30] _pullback(::Zygote.Context{false}, ::Base.var"##_#97", ::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ::ComposedFunction{typeof(objective), ComposedFunction{typeof(ParameterHandling.value), ParameterHandling.var"#unflatten_to_NamedTuple#17"{Float64, NamedTuple{(:k1, :k2, :noise_var), Tuple{NamedTuple{(:var, :precision), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, NamedTuple{(:var, :precision), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#15"{Float64, Tuple{Int64, Int64, Int64}, Tuple{Int64, Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_to_NamedTuple#17"{Float64, NamedTuple{(:var, :precision), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#15"{Float64, Tuple{Int64, Int64}, Tuple{Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}, ParameterHandling.var"#unflatten_to_NamedTuple#17"{Float64, NamedTuple{(:var, :precision), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#15"{Float64, Tuple{Int64, Int64}, Tuple{Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}, ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}}}, ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/YYT6v/src/compiler/interface2.jl:0
 [31] _apply(::Function, ::Vararg{Any})
    @ Core ./boot.jl:838
 [32] adjoint
    @ ~/.julia/packages/Zygote/YYT6v/src/lib/lib.jl:203 [inlined]
 [33] _pullback
    @ ~/.julia/packages/ZygoteRules/OgCVT/src/adjoint.jl:66 [inlined]
 [34] _pullback
    @ ./operators.jl:1031 [inlined]
 [35] _pullback(ctx::Zygote.Context{false}, f::ComposedFunction{typeof(objective), ComposedFunction{typeof(ParameterHandling.value), ParameterHandling.var"#unflatten_to_NamedTuple#17"{Float64, NamedTuple{(:k1, :k2, :noise_var), Tuple{NamedTuple{(:var, :precision), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, NamedTuple{(:var, :precision), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#15"{Float64, Tuple{Int64, Int64, Int64}, Tuple{Int64, Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_to_NamedTuple#17"{Float64, NamedTuple{(:var, :precision), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#15"{Float64, Tuple{Int64, Int64}, Tuple{Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}, ParameterHandling.var"#unflatten_to_NamedTuple#17"{Float64, NamedTuple{(:var, :precision), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#15"{Float64, Tuple{Int64, Int64}, Tuple{Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}, ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}}}, args::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/YYT6v/src/compiler/interface2.jl:0
 [36] pullback(f::Function, cx::Zygote.Context{false}, args::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/YYT6v/src/compiler/interface.jl:44
 [37] pullback
    @ ~/.julia/packages/Zygote/YYT6v/src/compiler/interface.jl:42 [inlined]
 [38] gradient(f::Function, args::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/YYT6v/src/compiler/interface.jl:96
 [39] (::var"#3#4")(θ::Vector{Float64})
    @ Main ./REPL[30]:9
 [40] (::NLSolversBase.var"#gg!#2"{var"#3#4"})(G::Vector{Float64}, x::Vector{Float64})
    @ NLSolversBase ~/.julia/packages/NLSolversBase/kavn7/src/objective_types/inplace_factory.jl:21
 [41] (::NLSolversBase.var"#fg!#8"{ComposedFunction{typeof(objective), ComposedFunction{typeof(ParameterHandling.value), ParameterHandling.var"#unflatten_to_NamedTuple#17"{Float64, NamedTuple{(:k1, :k2, :noise_var), Tuple{NamedTuple{(:var, :precision), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, NamedTuple{(:var, :precision), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#15"{Float64, Tuple{Int64, Int64, Int64}, Tuple{Int64, Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_to_NamedTuple#17"{Float64, NamedTuple{(:var, :precision), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#15"{Float64, Tuple{Int64, Int64}, Tuple{Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}, ParameterHandling.var"#unflatten_to_NamedTuple#17"{Float64, NamedTuple{(:var, :precision), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#15"{Float64, Tuple{Int64, Int64}, Tuple{Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}, ParameterHandling.var"#unflatten_Positive#25"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}}}, NLSolversBase.var"#gg!#2"{var"#3#4"}})(gx::Vector{Float64}, x::Vector{Float64})
    @ NLSolversBase ~/.julia/packages/NLSolversBase/kavn7/src/objective_types/abstract.jl:13
 [42] value_gradient!!(obj::OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}, x::Vector{Float64})
    @ NLSolversBase ~/.julia/packages/NLSolversBase/kavn7/src/interface.jl:82
 [43] initial_state(method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.BackTracking{Float64, Int64}, Nothing, Nothing, Flat}, options::Optim.Options{Float64, Nothing}, d::OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}, initial_x::Vector{Float64})
    @ Optim ~/.julia/packages/Optim/V8ZEC/src/multivariate/solvers/first_order/bfgs.jl:94
 [44] optimize
    @ ~/.julia/packages/Optim/V8ZEC/src/multivariate/optimize/optimize.jl:36 [inlined]
 [45] optimize(f::Function, g::Function, initial_x::Vector{Float64}, method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.BackTracking{Float64, Int64}, Nothing, Nothing, Flat}, options::Optim.Options{Float64, Nothing}; inplace::Bool, autodiff::Symbol)
    @ Optim ~/.julia/packages/Optim/V8ZEC/src/multivariate/optimize/interface.jl:156
 [46] top-level scope
    @ REPL[30]:7

Version info:

julia> versioninfo()
Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, tigerlake)
  Threads: 1 on 8 virtual cores

RFC: file structure

Currently, the code is split across four files:

  • ParameterHandling.jl (18 lines): imports, exports, includes
  • flatten.jl (127 lines): the base functionality (discussion whether to move this into a separate package in #43)
  • test_utils.jl (95 lines): interface tests
  • parameters.jl (247 lines, more than the other three together): everything else
    • AbstractParameter and value(),
    • value() for base types,
    • positive,
    • bounded,
    • fixed,
    • deferred,
    • orthogonal,
    • positive_definite

Following the discussion around positive_definite in #41, it seems that we might turn this into several: lower_triangular (no constraint on diagonal), lower_triangular_positive_diag (positive diagonal), and positive_definite for returning a PDMat.

Before making parameters.jl even longer (I already find it at, if not above, the upper end of how large a file can be while still keeping a good overview of its disparate content), I was wondering if it might make sense to split up parameters.jl -- there are of course plenty of options; here are the ones I can see immediately:

A: one file per "item", i.e. perhaps base.jl for AbstractParameter and value() for base types, then positive.jl, bounded.jl, fixed.jl, etc.

B: grouped into fewer files, perhaps as follows:

  • base.jl / interface.jl / something like that: AbstractParameter and value() for base types, perhaps also including fixed and deferred, or keeping the latter two separate
  • scalar.jl: positive, bounded
  • matrix.jl: orthogonal, positive_definite &c.

C: status quo.

It would also be an option (composable with any of the above) to move some of the base definitions, e.g. AbstractParameter, function value end, into ParameterHandling.jl.

What would be your preference? Maybe another suggestion?

Unflatten - Implementing views instead of new vectors

From: #39

This only relates to array-valued parameters. At the moment, when a Vector is unflattened, a new Vector is created for each argument in the tuple:

function flatten(::Type{T}, x::Tuple) where {T<:Real}
    x_vecs_and_backs = map(val -> flatten(T, val), x)
    x_vecs, x_backs = first.(x_vecs_and_backs), last.(x_vecs_and_backs)
    lengths = map(length, x_vecs)
    sz = _cumsum(lengths)
    function unflatten_to_Tuple(v::AbstractVector{<:Real})
        map(x_backs, lengths, sz) do x_back, l, s
            return x_back(v[(s - l + 1):s]) #HERE
        end
    end
    return reduce(vcat, x_vecs), unflatten_to_Tuple
end

This is necessary, as otherwise the unflattened NamedTuple would contain a bunch of SubArrays, as we cannot deduce the original array from:

function flatten(::Type{T}, x::NamedTuple{names}) where {T<:Real,names}
    x_vec, unflatten = flatten(T, values(x))
    function unflatten_to_NamedTuple(v::AbstractVector{<:Real})
        v_vec_vec = unflatten(v)
        return NamedTuple{names}(v_vec_vec) #HERE
    end
    return x_vec, unflatten_to_NamedTuple
end

Changing return NamedTuple{names}(v_vec_vec) to typeof(x)(v_vec_vec) would allow us to use views in the first code block, but this change would make many AD backends unusable.
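For reference, the view-based variant of unflatten_to_Tuple would look something like this (a sketch; as noted above, the x_backs would then receive SubArrays, and some AD backends would break):

function unflatten_to_Tuple(v::AbstractVector{<:Real})
    map(x_backs, lengths, sz) do x_back, l, s
        return x_back(view(v, (s - l + 1):s))  # a view instead of a copying slice
    end
end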

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Why constructor instead of `identity`

I was wondering why a constructor is used for the unflatten function in

flatten(::Type{T}, x::Vector{R}) where {T<:Real,R<:Real} = (Vector{T}(x), Vector{R})

instead of simply the identity.

Context: I was chasing an Enzyme error. The following doesn't work:

using Enzyme, ParameterHandling
testfun(θ) = sum(θ[1])
θ = (rand(10),)
x, unpack = value_flatten(θ)
dx = zero(x)
autodiff(Reverse, testfun ∘ unpack, Duplicated(x, dx)) # ERROR: Duplicated Returns not yet handled

But when overloading flatten for vectors as follows, it works ok.

ParameterHandling.flatten(::Type{T}, x::Vector{R}) where {T<:Real,R<:Real} = (Vector{T}(x), identity)

There's probably a bug somewhere in Enzyme which makes it unable to differentiate this, but as always I find it pretty hard to reduce to an MWE.

allow for `fixed(positive(1.0))` - apply `value` recursively?

To allow for specifying both that a parameter should be positive and that it is currently not to be optimised over, it'd be great to support fixed(positive(1.0)). This could be achieved by applying value recursively (or at least twice, perhaps special-casing fixed?) -- see the sketch below.

Happy to provide a PR if you'd be willing to include it.
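A sketch of the "apply value recursively" idea (assuming the current method is value(x::Fixed) = x.value):

# Recurse into the wrapped parameter rather than returning it as-is.
ParameterHandling.value(x::ParameterHandling.Fixed) = ParameterHandling.value(x.value)

θ = fixed(positive(1.0))
ParameterHandling.value(θ)  # 1.0, rather than a Positive wrapper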

CI

Our CI seems to have stopped running. Looking at the CI file, it looks like it's very outdated, and could do with an overhaul.
