Comments (10)
Actually, this is breaking. Since the constructors for all the ordinal pattern outcome spaces now have default values for their positional arguments, we can't simply deprecate the constructor signatures, because the positional symbols will collide with the keyword symbols. Was there any particular reason we did we switched to positional arguments? I can't remember if we discussed it or if it was a silent change.
I don't think too many people have upgraded yet, so perhaps we should handle this is an almost-semver spirit, where if the user provides τ
or lt
as keywords to any of the constructors, they are simply warned that the keyword argument they provided will override the positional argument. This is strictly breaking, but with the warning, it should be easy enough to fix, so maybe we can justify not going for 4.0 just for this minor semver violation.
I'll submit a PR with my suggestion
from complexitymeasures.jl.
The other alternative is to revert to using keyword arguments, which is much cleaner IMO.
from complexitymeasures.jl.
I don't understand the problem. We can easily deprecate this. We need the following methods:
OP{m}(; τ, less) # the deprecated one. throws warning.
OP{m}(τ, less = rand_less) # the normal one.
once the deprecated one is removed then we simply put a default value back to τ
in the 2nd function.
I don't like putting functions as keyword arguments because it is not clear whether type specialization will occur on keyword arugments. This may have performance impact. If you can benchmark it and show that it doesn't, then ok for keyword arguments.
from complexitymeasures.jl.
The issue is that OP{m}(τ, less = rand_less)
is not the method we released for 3.0. We released OP{m}(τ = 1, ls = rand_less)
, which also has a default value for the first argument).
Now, if the user calls OP{m}(; τ = 1)
, the actual function call is OP{m}(τ = 1, less = rand_less; τ = 1)
. Which τ
is supposed to be used, the argument or the keyword argument? I may be missing something obvious here, but I don't see how we can deprecate like suggested above.
We'd have to do something like the following, where we make a choice about whether the argument or the keyword argument should take precedence.
function OrdinalPatterns{m}(τ::Int = 1, lt::F=isless_rand; kwargs...) where {m, F}
if haskey(kwargs, :τ)
msg = "Keyword argument `τ` to `OrdinalPatterns` is deprecated. " *
"The signature is now " *
"`OrdinalPatterns{m}(τ = 1, lt::Function = ComplexityMeasures.isless_rand)`" *
", so provide `τ` as a positional argument instead. " *
"In this call, the given keyword `τ` is used instead of the positional `τ`."
@warn msg
τ = kwargs[:τ]
end
if haskey(kwargs, :lt)
msg = "Keyword argument `lt` to `OrdinalPatterns` is deprecated. " *
"The signature is now " *
"`OrdinalPatterns{m}(τ = 1, lt::Function = ComplexityMeasures.isless_rand)`" *
", so provide `lt` as a positional argument instead. " *
"In this call, the given keyword `lt` is used instead of the positional `lt`."
@warn msg
lt = kwargs[:lt]
end
m >= 2 || throw(ArgumentError("Need order m ≥ 2."))
return OrdinalPatterns{m, F}(OrdinalPatternEncoding{m}(lt), τ)
end
from complexitymeasures.jl.
The issue is that
OP{m}(τ, less = rand_less)
is not the method we released for 3.0. We releasedOP{m}(τ = 1, ls = rand_less)
, which also has a default value for the first argument).
well okay, that was a bug so it isn't a breaking change to fix it. The documentation dictates what is breaking or not, not if we made a mistake in the source code.
Now, if the user calls
OP{m}(; τ = 1)
,
According to my comment If the user calls that they get the warning of the deprecated method, and then the method OP(τ)
is called, assuming we fix the source code, so thereisn't any problem.
But in any case, the code you paste above solves all the problems as well.
from complexitymeasures.jl.
I don't like putting functions as keyword arguments because it is not clear whether type specialization will occur on keyword arugments. This may have performance impact.
Just because I'm curious and want to make sure I haven't missed anything: Where is it that you're worried about the performance impact, specifically? This is the OrdinalPatterns
struct:
struct OrdinalPatterns{M,F} <: OrdinalOutcomeSpace{M}
encoding::OrdinalPatternEncoding{M,F}
τ::Int
end
The constructor is
function OrdinalPatterns{m}(τ::Int = 1, lt::F=isless_rand; kwargs...) where {m, F}
m >= 2 || throw(ArgumentError("Need order m ≥ 2."))
return OrdinalPatterns{m, F}(OrdinalPatternEncoding{m}(lt), τ)
end
which has a function barrier in place inside the constructor that ensures specialization for the encoding
field. At what stage here do we potentially introduce performance drop by using keyword arguments for the outer constructor?
EDIT: I'll do a small benchmark too, just to check if there is any difference
from complexitymeasures.jl.
The issue is that
OP{m}(τ, less = rand_less)
is not the method we released for 3.0. We releasedOP{m}(τ = 1, ls = rand_less)
, which also has a default value for the first argument).well okay, that was a bug so it isn't a breaking change to fix it. The documentation dictates what is breaking or not, not if we made a mistake in the source code.
The docstring is currently:
"""
OrdinalPatterns <: OutcomeSpace
OrdinalPatterns{m}(τ = 1, lt::Function = ComplexityMeasures.isless_rand)
An [`OutcomeSpace`](@ref) based on lengh-`m` ordinal permutation patterns, originally
introduced in [BandtPompe2002](@citet)'s paper on permutation entropy.
Note that `m` is given as a type parameter, so that when it is a literal integer
there are performance accelerations.
...
"""
This is a breaking change UNLESS we use my solution above and let the keyword argument take precedence. Do we agree?
from complexitymeasures.jl.
your solution works yes.
At what stage here do we potentially introduce performance drop by using keyword arguments for the outer constructor?
I am not sure. Perhaps I am saying nonsense and this doesn't matter at all. But we haven't checked for it either. The case I would check is a for
loop where the OP is instantiated inside the for loop. Not sure what scenario this would be but let's imagine a user code
for x in thousand_timeseries
OP = OrPar(; ...)
entropy(OP, x)
end
that's the case I would test.
from complexitymeasures.jl.
Some setup:
# Identical structs, but we use a positional argument outer constructor
# for OP1 and a keyword argument constructor for OP2
struct OP1{M,F, I <: Integer} <: OrdinalOutcomeSpace{M}
encoding::OrdinalPatternEncoding{M,F}
τ::I
end
struct OP2{M,F, I <: Integer} <: OrdinalOutcomeSpace{M}
encoding::OrdinalPatternEncoding{M,F}
τ::I
end
function OP1{m}(τ::I = 1, lt::F=isless_rand) where {m, F, I}
return OP1{m, F, I}(OrdinalPatternEncoding{m}(lt), τ)
end
function OP2{m}(; τ::I = 1, lt::F=isless_rand) where {m, F, I}
return OP2{m, F, I}(OrdinalPatternEncoding{m}(lt), τ)
end
# So that we can call `probabilities`
permutation_weights(::OP1, ::Any) = nothing
permutation_weights(::OP2, ::Any) = nothing
function test_OP1(τs, funcs)
for (τ, lt) in zip(τs, funcs)
OP1{3}(τ, lt)
end
end
function test_OP2(τs, funcs)
for (τ, lt) in zip(τs, funcs)
OP2{3}(; τ, lt)
end
end
function test_OP1_do_stuff(τs, funcs, x)
for i in eachindex(τs)
o = OP1{3}(τs[i], funcs[i])
probabilities(o, x)
end
end
function test_OP2_do_stuff(τs, funcs, x)
for i in eachindex(τs)
o = OP2{3}(; τ = τs[i], lt = funcs[i])
probabilities(o, x)
end
end
# Setup and precompilation
using Random;
rng = MersenneTwister(1234);
n = 1000
τs = rand(1:50, n);
fs = rand([Base.isless, ComplexityMeasures.isless_rand], n);
x = rand(1000);
test_OP1(τs, fs);
test_OP2(τs, fs);
test_OP1_do_stuff(τs, fs, x);
test_OP2_do_stuff(τs, fs, x);
The initialization of the struct is indeed slower for the keyword constructor than for the positional argument constructor:
julia> @btime test_OP1($τs, $fs)
76.666 μs (2000 allocations: 62.50 KiB)
julia> @btime test_OP2($τs, $fs)
199.000 μs (4000 allocations: 93.75 KiB)
But this overhead is completely irrelevant when actually performing any kind of operation with the outcome space:
julia> @btime test_OP1_do_stuff($τs, $fs, $x)
72.398 ms (168000 allocations: 44.44 MiB)
julia> @btime test_OP2_do_stuff($τs, $fs, $x)
72.870 ms (170000 allocations: 44.47 MiB)
For what we do in this package, with a reasonable-length time series (here: 1000 points) the actual calculations take three orders of magnitude longer than constructing the type instances. The difference would be even larger when also computing something like entropies on top of the probabilities.
Conclusion: we don't have to worry about keyword arguments in the constructors, except when the work done with the constructed types is minimal. Then some optimization on positional arguments could be relevant.
from complexitymeasures.jl.
I am not sure. Perhaps I am saying nonsense and this doesn't matter at all.
You're obviously right. But I wonder why? I can't find anything concrete in the Julia docs about it
from complexitymeasures.jl.
Related Issues (20)
- ```genentropy``` is broken HOT 1
- Docstring and implementation for Statistical Complexity is wrong HOT 1
- `eachindex` for `Probabilities` is ambiguous HOT 3
- Signature for `Counts` and `Probabilities` docstrings has the wrong type parameter order HOT 3
- Some documentation issues for CI HOT 2
- The function `lt` in `OrdinalPatternEncoding` isn't actually used HOT 1
- Reproducibility for `OrdinalPatternEncoding` HOT 3
- It shouldn't be possible to construct an empty `CombinationEncoding` HOT 3
- Feature: "distribution entropy" HOT 3
- Feature: bubble entropy (description is WIP) HOT 4
- Feature: "increment entropy" HOT 1
- Feature: "attention entropy"
- `missing_probabilities` HOT 1
- `counts_and_outcomes` for `BubbleSortSwaps` should also accept state space sets
- Syntax with type parameter `{m}` in `OrdinalPatterns` is not harmonious with the rest of the library HOT 10
- Encoding using `Dispersion` is slower than necessary due to manual integration for normal cdf
- Encoding complex-valued data HOT 2
- [Q] How to calculate MI between two vectors? HOT 3
- Latest stable documentation has an error in the `StatisticalComplexity` docstring HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from complexitymeasures.jl.