Code Monkey home page Code Monkey logo

Comments (15)

kahaaga avatar kahaaga commented on May 27, 2024 1

Current behaviour

For now, generating motifs (i.e. the permutations that sort state vector elements) is done using the built-in sortperm!, which, when using default arguments to sortperm!, takes the first elements as the smaller one. I am not sure what would happen if other sorting algorithms were used - I would have to look deeper into the sorting code to see what happens here.

julia> x = [1, 2, 1, 2, 3, 2, 3];

julia> sortperm(x)
7-element Array{Int64,1}:
 1
 3
 2
 4
 6
 5
 7

Allowing random assignment of largest/smallest: proposal

However, the full method signature for sortperm! is

sortperm!(ix, v; alg::Algorithm=DEFAULT_UNSTABLE, lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward, initialized::Bool=false)

Conveniently, it possible to provide a custom less-than function by providing an alternative function to the lt keyword argument, which defaults to isless.

@Datseris For the best flexibility for the user, why not introduce ls as a field for the estimator(s)? That way, the user can provide their own logic for deciding what to do with equal elements. We could for example suggest in the docstring to use isless for the current behavior, and maybe something like isless_rand, as defined below, for randomly assigning smallest/largest for equal elements?

function isless_rand(a, b)
    if a  b
        rand(Bool)
    elseif a < b
        true
    else
        false
    end
end

This function could be pre-defined by us (but not exported or in the docs). Then we can just mention in the docstring that the user has to do SymbolicPermutation(lt=Entropies.isless_rand) when initializing the estimator.

References

It would be nice to have the reference that makes the claim that random assignment is better than just taking the first. I guess which approach is best would depend on the application too, so perhaps there is more than one paper that needs to be cited here.

from complexitymeasures.jl.

Datseris avatar Datseris commented on May 27, 2024

yes, making lt a field of the Symbolic estimators sounds like the perfect plan, and I would make the isless_rand the default version, but exchange the isapprox with ==.

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on May 27, 2024

I forgot to attach it in the comment above, but this is an example of what you get if you use isless_rand on the same vector as in the previous example:

julia> x = [1, 2, 1, 2, 3, 2, 3];

julia> sortperm(x, lt=isless_rand)
7-element Array{Int64,1}:
 1
 3
 2
 4
 6
 7
 5

For this particular sorting, the sorting indices for the 3s are reversed relative to the default behaviour. So this should work perfectly.

yes, making lt a field of the Symbolic estimators sounds like the perfect plan, I would make the isless_rand the default version.

Agreed. I'll make a PR with the change.

but exchange the isapprox with ==.

Agreed. Approximate comparisons are slower too, so no reason to use them here.

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on May 27, 2024

Hm, this is more complicated than I though. Using sortperm(x, lt=isless_rand), works in some cases.

However, when running the example permutation entropy code in the documentation, Julia without any error warning when calling sortperm(x, lt=isless_rand). Changing it to sortperm(x, lt=Base.isless) works perfectly.

There must be some underlying criterion that the lt function must fulfill that I can't find in the Julia docs. I'll continue troubleshooting.

EDIT: fixed the same call twice.

from complexitymeasures.jl.

Datseris avatar Datseris commented on May 27, 2024

Julia without any error warning when calling sortperm(x, lt=isless_rand). Changing it to sortperm(x, lt=isless_rand) works perfectly.

? you used the same code twice

from complexitymeasures.jl.

Datseris avatar Datseris commented on May 27, 2024

Does the isless_rand work on static vectors? what's the input? is it always a timeseries?

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on May 27, 2024

sort! calls Base.ord, so this may be related to

    ord(lt, by, rev::Union{Bool, Nothing}, order::Ordering=Forward)
Construct an [`Ordering`](@ref) object from the same arguments used by
[`sort!`](@ref).
Elements are first transformed by the function `by` (which may be
[`identity`](@ref)) and are then compared according to either the function `lt`
or an existing ordering `order`. `lt` should be [`isless`](@ref) or a function
which obeys similar rules. Finally, the resulting order is reversed if
`rev=true`.

lt should be isless or a function which obeys similar rules. So what are the rules?

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on May 27, 2024

Does the isless_rand work on static vectors? what's the input? is it always a timeseries?

Where the code fails, the input is a Vector{Int}.

? you used the same code twice

Ah, sorry! I updated the comment.

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on May 27, 2024

Ok, so these are the criteria

    isless(x, y)

Test whether `x` is less than `y`, according to a fixed total order.
`isless` is not defined on all pairs of values `(x, y)`. However, if it
is defined, it is expected to satisfy the following:
- If `isless(x, y)` is defined, then so is `isless(y, x)` and `isequal(x, y)`,
  and exactly one of those three yields `true`.
- The relation defined by `isless` is transitive, i.e.,
  `isless(x, y) && isless(y, z)` implies `isless(x, z)`.

from complexitymeasures.jl.

Datseris avatar Datseris commented on May 27, 2024

Ah okay, then our version clearly doesn't satisfy this, right?

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on May 27, 2024

Does the isless_rand work on static vectors? what's the input? is it always a timeseries?

Where the code fails, the input is a Vector{Int}.

Ah, sorry! This was a brain fart from my side. I was applying the isless_rand in the wrong place. It should be applied to each state vector when symbolizing (i.e. sort vector -> assign integer to it), not when sorting the final symbolized time series (which is a Vector{Int}) for computing the probabilites (which of course takes place after symbolization has already occurred).

Your hint about the static vectors solved it! 🥳

I think... just need to re-generate the docs and check.

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on May 27, 2024

... yes! It works. Thanks again for the hint, @Datseris!

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on May 27, 2024

Ah okay, then our version clearly doesn't satisfy this, right?

I think you're right. The transitivity condition is not fulfilled because either condition in isless_rand(a, b) && isless_rand(b, c) && isless_rand(a, c) may be either true/false if a=b=c.

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on May 27, 2024

Using lt=isless_rand seems to work in practice when symbolizing the state vectors, though, and behaves as expected on the inputs I can come up with to test it with, so I'm not sure this is a problem. I'm a bit uneasy about using something that I don't understand why works, but if it works, it works.

Maybe implement and leave this open for future discussion once we understand the issue better?

from complexitymeasures.jl.

kahaaga avatar kahaaga commented on May 27, 2024

Maybe implement and leave this open for future discussion once we understand the issue better?

See discussion in #40. Tests confirm that behaviour is as expected when using isless_rand, so this got merged. Since #40 is now merged, I'm closing this.

from complexitymeasures.jl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.