Code Monkey home page Code Monkey logo

Comments (10)

kahaaga avatar kahaaga commented on July 24, 2024

Actually, this is breaking. Since the constructors for all the ordinal pattern outcome spaces now have default values for their positional arguments, we can't simply deprecate the constructor signatures, because the positional symbols will collide with the keyword symbols. Was there any particular reason we did we switched to positional arguments? I can't remember if we discussed it or if it was a silent change.

I don't think too many people have upgraded yet, so perhaps we should handle this is an almost-semver spirit, where if the user provides τ or lt as keywords to any of the constructors, they are simply warned that the keyword argument they provided will override the positional argument. This is strictly breaking, but with the warning, it should be easy enough to fix, so maybe we can justify not going for 4.0 just for this minor semver violation.

I'll submit a PR with my suggestion

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on July 24, 2024

The other alternative is to revert to using keyword arguments, which is much cleaner IMO.

from complexitymeasures.jl.

Datseris avatar Datseris commented on July 24, 2024

I don't understand the problem. We can easily deprecate this. We need the following methods:

OP{m}(; τ, less) # the deprecated one. throws warning.
OP{m}(τ, less = rand_less) # the normal one.

once the deprecated one is removed then we simply put a default value back to τ in the 2nd function.

I don't like putting functions as keyword arguments because it is not clear whether type specialization will occur on keyword arugments. This may have performance impact. If you can benchmark it and show that it doesn't, then ok for keyword arguments.

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on July 24, 2024

The issue is that OP{m}(τ, less = rand_less) is not the method we released for 3.0. We released OP{m}(τ = 1, ls = rand_less) , which also has a default value for the first argument).

Now, if the user calls OP{m}(; τ = 1), the actual function call is OP{m}(τ = 1, less = rand_less; τ = 1). Which τ is supposed to be used, the argument or the keyword argument? I may be missing something obvious here, but I don't see how we can deprecate like suggested above.

We'd have to do something like the following, where we make a choice about whether the argument or the keyword argument should take precedence.

function OrdinalPatterns{m}::Int = 1, lt::F=isless_rand; kwargs...) where {m, F}
    if haskey(kwargs, )
        msg = "Keyword argument `τ` to `OrdinalPatterns` is deprecated. " *
        "The signature is now " * 
        "`OrdinalPatterns{m}(τ = 1, lt::Function = ComplexityMeasures.isless_rand)`" * 
        ", so provide `τ` as a positional argument instead. " * 
        "In this call, the given keyword `τ` is used instead of the positional `τ`."
        @warn msg
        τ = kwargs[]
    end
    if haskey(kwargs, :lt)
        msg = "Keyword argument `lt` to `OrdinalPatterns` is deprecated. " *
        "The signature is now " * 
        "`OrdinalPatterns{m}(τ = 1, lt::Function = ComplexityMeasures.isless_rand)`" * 
        ", so provide `lt` as a positional argument instead. "  * 
        "In this call, the given keyword `lt` is used instead of the positional `lt`."
        @warn msg
        lt = kwargs[:lt]
    end

    m >= 2 || throw(ArgumentError("Need order m ≥ 2."))
    return OrdinalPatterns{m, F}(OrdinalPatternEncoding{m}(lt), τ)
end

from complexitymeasures.jl.

Datseris avatar Datseris commented on July 24, 2024

The issue is that OP{m}(τ, less = rand_less) is not the method we released for 3.0. We released OP{m}(τ = 1, ls = rand_less) , which also has a default value for the first argument).

well okay, that was a bug so it isn't a breaking change to fix it. The documentation dictates what is breaking or not, not if we made a mistake in the source code.

Now, if the user calls OP{m}(; τ = 1),

According to my comment If the user calls that they get the warning of the deprecated method, and then the method OP(τ) is called, assuming we fix the source code, so thereisn't any problem.

But in any case, the code you paste above solves all the problems as well.

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on July 24, 2024

I don't like putting functions as keyword arguments because it is not clear whether type specialization will occur on keyword arugments. This may have performance impact.

Just because I'm curious and want to make sure I haven't missed anything: Where is it that you're worried about the performance impact, specifically? This is the OrdinalPatterns struct:

struct OrdinalPatterns{M,F} <: OrdinalOutcomeSpace{M}
    encoding::OrdinalPatternEncoding{M,F}
    τ::Int
end

The constructor is

function OrdinalPatterns{m}::Int = 1, lt::F=isless_rand; kwargs...) where {m, F}
    m >= 2 || throw(ArgumentError("Need order m ≥ 2."))
    return OrdinalPatterns{m, F}(OrdinalPatternEncoding{m}(lt), τ)
end

which has a function barrier in place inside the constructor that ensures specialization for the encoding field. At what stage here do we potentially introduce performance drop by using keyword arguments for the outer constructor?

EDIT: I'll do a small benchmark too, just to check if there is any difference

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on July 24, 2024

The issue is that OP{m}(τ, less = rand_less) is not the method we released for 3.0. We released OP{m}(τ = 1, ls = rand_less) , which also has a default value for the first argument).

well okay, that was a bug so it isn't a breaking change to fix it. The documentation dictates what is breaking or not, not if we made a mistake in the source code.

The docstring is currently:

"""
    OrdinalPatterns <: OutcomeSpace
    OrdinalPatterns{m}(τ = 1, lt::Function = ComplexityMeasures.isless_rand)

An [`OutcomeSpace`](@ref) based on lengh-`m` ordinal permutation patterns, originally
introduced in [BandtPompe2002](@citet)'s paper on permutation entropy.
Note that `m` is given as a type parameter, so that when it is a literal integer
there are performance accelerations.
... 
"""

This is a breaking change UNLESS we use my solution above and let the keyword argument take precedence. Do we agree?

from complexitymeasures.jl.

Datseris avatar Datseris commented on July 24, 2024

your solution works yes.

At what stage here do we potentially introduce performance drop by using keyword arguments for the outer constructor?

I am not sure. Perhaps I am saying nonsense and this doesn't matter at all. But we haven't checked for it either. The case I would check is a for loop where the OP is instantiated inside the for loop. Not sure what scenario this would be but let's imagine a user code

for x in thousand_timeseries
OP = OrPar(; ...)
entropy(OP, x)
end

that's the case I would test.

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on July 24, 2024

Some setup:

# Identical structs, but we use a positional argument outer constructor
#  for OP1 and a keyword argument constructor for OP2
struct OP1{M,F, I <: Integer} <: OrdinalOutcomeSpace{M}
    encoding::OrdinalPatternEncoding{M,F}
    τ::I
end
struct OP2{M,F, I <: Integer} <: OrdinalOutcomeSpace{M}
    encoding::OrdinalPatternEncoding{M,F}
    τ::I
end

function OP1{m}::I = 1, lt::F=isless_rand) where {m, F, I}
    return OP1{m, F, I}(OrdinalPatternEncoding{m}(lt), τ)
end
function OP2{m}(; τ::I = 1, lt::F=isless_rand) where {m, F, I}
    return OP2{m, F, I}(OrdinalPatternEncoding{m}(lt), τ)
end

# So that we can call `probabilities`
permutation_weights(::OP1, ::Any) = nothing
permutation_weights(::OP2, ::Any) = nothing

function test_OP1(τs, funcs)
    for (τ, lt) in zip(τs, funcs)
        OP1{3}(τ, lt)
    end
end

function test_OP2(τs, funcs)
    for (τ, lt) in zip(τs, funcs)
        OP2{3}(; τ, lt)
    end
end

function test_OP1_do_stuff(τs, funcs, x)
    for i in eachindex(τs)
        o = OP1{3}(τs[i], funcs[i])
        probabilities(o, x)
    end
end

function test_OP2_do_stuff(τs, funcs, x)
    for i in eachindex(τs)
        o = OP2{3}(; τ = τs[i], lt = funcs[i])
        probabilities(o, x)
    end
end

# Setup and precompilation
using Random; 
rng = MersenneTwister(1234);
n = 1000
τs = rand(1:50, n);
fs = rand([Base.isless, ComplexityMeasures.isless_rand], n);
x = rand(1000);

test_OP1(τs, fs);
test_OP2(τs, fs);
test_OP1_do_stuff(τs, fs, x);
test_OP2_do_stuff(τs, fs, x);

The initialization of the struct is indeed slower for the keyword constructor than for the positional argument constructor:

julia> @btime test_OP1($τs, $fs)

  76.666 μs (2000 allocations: 62.50 KiB)

julia> @btime test_OP2($τs, $fs)
  199.000 μs (4000 allocations: 93.75 KiB)

But this overhead is completely irrelevant when actually performing any kind of operation with the outcome space:

julia> @btime test_OP1_do_stuff($τs, $fs, $x)

  72.398 ms (168000 allocations: 44.44 MiB)

julia> @btime test_OP2_do_stuff($τs, $fs, $x)

  72.870 ms (170000 allocations: 44.47 MiB)

For what we do in this package, with a reasonable-length time series (here: 1000 points) the actual calculations take three orders of magnitude longer than constructing the type instances. The difference would be even larger when also computing something like entropies on top of the probabilities.

Conclusion: we don't have to worry about keyword arguments in the constructors, except when the work done with the constructed types is minimal. Then some optimization on positional arguments could be relevant.

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on July 24, 2024

I am not sure. Perhaps I am saying nonsense and this doesn't matter at all.

You're obviously right. But I wonder why? I can't find anything concrete in the Julia docs about it

from complexitymeasures.jl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.