mcabbott / axiskeys.jl

License: MIT License

Julia 100.00%

axiskeys.jl's Issues

Feature request: Iterators.product

It would be nice for Iterators.product applied to KeyedArrays to appropriately carry forward axis information. I am happy to make a pull request for this if there is interest. One design decision is how to handle the case where a KeyedArray is "producted" with a standard Array. I think dropping the keys in this case would make sense.
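
Until something like this exists, a keyed product can be built by hand: collect the product of the raw data and re-attach the keys. This is only a sketch. The arrays and values are made up, and it assumes `parent(parent(x))` reaches the underlying `Vector`, which should hold for arrays built with the keyword constructor.

```julia
using AxisKeys

x = KeyedArray([1, 2]; a=[:p, :q])
y = KeyedArray([10, 20, 30]; b=1:3)

# Unwrap KeyedArray -> NamedDimsArray -> Vector
xdata = parent(parent(x))
ydata = parent(parent(y))

# collect turns the lazy product into a 2×3 Matrix of tuples
tuples = collect(Iterators.product(xdata, ydata))

# Re-attach the keys of each input along the matching dimension
xy = KeyedArray(tuples; a=axiskeys(x, :a), b=axiskeys(y, :b))
```

A built-in method would presumably stay lazy instead of collecting, so this only illustrates which keys should end up on which dimension.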

Add `rekey` method

It's kind of annoying to have to reconstruct a KeyedArray every time you want to replace the keys of an array. I was thinking that we could do something similar to rename:

rekey(ka, (key1, key2, ...))
rekey(ka, dim1 => key1)

Maybe also:

rekey(ka, dim1 => (newdim, key1))

To do a rename and rekey in one operation?

NOTE: We're already doing a version of this in AxisSets.jl, but it'd be nice if we could just extend a method here.
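
For the first form, a minimal sketch could look like the following. The name `rekey_sketch` is hypothetical and is not an AxisKeys function. It assumes `parent(ka)` returns the named array underneath, as the other reports on this page suggest, and that the `KeyedArray(data, keys_tuple)` constructor accepts it.

```julia
using AxisKeys

# Hypothetical helper: keep the data and dimension names, swap in a new tuple of keys
rekey_sketch(ka::KeyedArray, newkeys::Tuple) = KeyedArray(parent(ka), newkeys)

ka = KeyedArray(rand(2, 3); row=[:a, :b], col=1:3)
kb = rekey_sketch(ka, ([:x, :y], 11:13))
```

The `dim => key` forms would additionally need to look up the dimension by name and replace only that entry of `axiskeys(ka)`.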

LinAlg between Triangular and KeyedArray is ambiguous

NamedDimsArray works nicely with triangular matrices but KeyedArray gives an error:

julia> using AxisKeys, LinearAlgebra, Statistics
julia> data = rand(3, 5);
julia> KA = KeyedArray(data, features=[:a, :b, :c], time=1:5);
julia> U = cholesky(cov(data')).U;
julia> U \ KA.data
3×5 NamedDimsArray(::Matrix{Float64}, (:features, :time)):
            → time
↓ features  -4.95547  -4.3829    -3.22884  -2.74354  -2.12881
             2.49055   0.510476   2.61172   1.07843   0.476692
             7.43526   7.28534    4.5931    5.92775   2.64983
julia> U \ KA
ERROR: MethodError: \(::UpperTriangular{Float64, Matrix{Float64}}, ::KeyedArray{Float64, 2, NamedDimsArray{(:features, :time), Float64, 2, Matrix{Float64}}, Tuple{Vector{Symbol}, UnitRange{Int64}}}) is ambiguous. Candidates:
  \(A::Union{LowerTriangular, UpperTriangular}, B::AbstractMatrix{T} where T) in LinearAlgebra at /Applications/Julia-1.6.app/Contents/Resources/julia/share/julia/stdlib/v1.6/LinearAlgebra/src/triangular.jl:1668
  \(x::AbstractMatrix{T} where T, y::KeyedArray{T, 2, AT, RT} where {T, AT, RT}) in AxisKeys at /Users/molet/.julia/packages/AxisKeys/VS8DQ/src/functions.jl:302
Possible fix, define
  \(::Union{LowerTriangular, UpperTriangular}, ::KeyedArray{T, 2, AT, RT} where {T, AT, RT})
Stacktrace:
 [1] top-level scope
   @ REPL[435]:1
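
The "Possible fix" hint follows the usual pattern for method ambiguities in Julia. The toy below uses only Base, and the function name `combine` is made up for illustration. It shows the same situation in miniature: two methods that each win on one argument, and a third method on the intersection that resolves the tie.

```julia
# Two methods, neither more specific than the other for (Int, Int)
combine(a::Integer, b) = "integer first"
combine(a, b::Integer) = "integer second"

# At this point combine(1, 2) would throw a MethodError reporting an ambiguity.

# A method for the intersection of the two signatures resolves it
combine(a::Integer, b::Integer) = "both integers"

combine(1, 2)
```

For the report above, the analogous method would take a triangular matrix on the left and a KeyedArray matrix on the right.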

Difficulty writing into array (unsure if intentional)

Trying to write into an array gives an error when one of the arrays has a trailing singleton dimension:

julia> table(:, :total) .= sum(pointwise([:loo_score, :naive_score, :overfit]); dims=1)'
ERROR: DimensionMismatch("cannot broadcast array to have fewer dimensions")

Where table(:, :total) and sum(pointwise([:loo_score, :naive_score, :overfit]); dims=1)' return:

1-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   criterion ∈ 3-element Vector{Symbol}
And data, 3-element view(::Matrix{Float64}, :, 1) with eltype Float64:
 (:loo_score)    2.5e-323
 (:naive_score)  5.0e-323
 (:overfit)      7.4e-323


2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   statistic ∈ 3-element Vector{Symbol}
→   data ∈ 1-element OneTo{Int}
And data, 3×1 adjoint(::Matrix{Float64}) with eltype Float64:
                  (1)
  (:loo_score)     69.38414329944844
  (:naive_score)  218.93276111448398
  (:overfit)      149.54861781503556

This could be left as is (which would be consistent with Base), or trailing singleton dimensions could be dropped automatically for ease of use.
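
As a workaround, the singleton dimension can be dropped explicitly before assigning. The sketch below uses plain Base arrays with made-up values as a stand-in for the keyed arrays in the report; whether the direct assignment fails may depend on the Julia version.

```julia
dest = zeros(3)                          # 1-dimensional destination
src = reshape([1.0, 2.0, 3.0], 3, 1)     # 3×1 matrix with a trailing singleton dimension

# Dropping the singleton dimension first makes the shapes agree
dest .= dropdims(src; dims=2)
```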

Feature request: column/row lookup with `.` syntax

I believe defining a getproperty method for KeyedArrays which looks up a column/row and then returns it should be enough for this to work.

No attribute when single key is selected

Hi, usually when I pick out a certain key I get the list of attributes. When a single key is used, however, this isn't the case:

using AxisKeys
data = rand(Int8, 3,10,3) .|> abs;
A = KeyedArray(data; channel=[:MEG1, :MEG2, :MEG3], time=range(13, step=2.5, length=10), iter=31:33)
test = A(channel=[:MEG1, :MEG2])
test.channel

This will give me

2-element view(::Array{Symbol,1}, [1, 2]) with eltype Symbol:
 :MEG1
 :MEG2

But when a single key is selected, the attribute isn't generated:

test_singular = A(channel=[:MEG1])

ERROR: type KeyedArray has no field channel
Stacktrace:
 [1] getproperty(::KeyedArray{Int8,2,NamedDimsArray{(:time, :iter),Int8,2,SubArray{Int8,2,Array{Int8,3},Tuple{Int64,Base.Slice{Base.OneTo{Int64}},Base.Slice{Base.OneTo{Int64}}},true}},Tuple{StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}},UnitRange{Int64}}}, ::Symbol) at C:\Users\asimd\.julia\packages\AxisKeys\TSTM2\src\names.jl:63

I feel that when you have a singular index, it would still be nice to have the name of the key/dimension as an attribute.

Axiskeys as a NamedTuple?

Is there a nice way to do the following:

julia> using AxisKeys

julia> arr = KeyedArray(randn(2,2), a=1:2, b=3:4);

julia> namedaxiskeys(arr) # is there such a function/how to do this?
(a=1:2, b=3:4)
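
One way to build this by hand is to pair the dimension names with the keys. The sketch assumes `dimnames` is available after `using AxisKeys` (it comes from NamedDims), and the variable name `nt` is only for illustration.

```julia
using AxisKeys

arr = KeyedArray(randn(2, 2), a=1:2, b=3:4)

# dimnames(arr) is a tuple of Symbols and axiskeys(arr) a tuple of key vectors,
# so together they form a NamedTuple
nt = NamedTuple{dimnames(arr)}(axiskeys(arr))
```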

indexed assignment does not work with dimension name specification

Indexing by dimension names does not seem to work for LHS assignments:

julia> K = wrapdims(rand(2,2), x=1:2, y=1:2);
julia> K[x=1] .= 0
ERROR: MethodError: no method matching dotview(::KeyedArray{Float64,2,NamedDimsArray{(:x, :y),Float64,2,Array{Float64,2}},Tuple{UnitRange{Int64},UnitRange{Int64}}}; x=1)
Closest candidates are:
  dotview(::KeyedArray, ::Any...) at /home/takbal/.julia/packages/AxisKeys/G3Okw/src/struct.jl:137 got unsupported keyword argument "x"
  dotview(::Any...) at broadcast.jl:1160 got unsupported keyword argument "x"
  dotview(::BitArray, ::BitArray) at broadcast.jl:1130 got unsupported keyword argument "x"
  ...
Stacktrace:
 [1] top-level scope at REPL[182]:1

It works for NamedDims though:

julia> parent(K)[x=1] .= 0
2-element NamedDimsArray(view(::Array{Float64,2}, 1, :), (:y,)):
↓ y  0.0
     0.0

Looks like the reason is that the Base.dotview() specialization does not pass keyword arguments upwards.

isequal errors when comparing arrays of different dimensions

I believe the problem is that isequal etc. don't verify that A and B have the same number of dimensions before calling unifiable_keys.

AxisKeys.jl/src/functions.jl

Lines 278 to 286 in cddf60f

    
for fun in [:(==), :isequal, :isapprox]
    for (T, S) in [ (:KeyedArray, :KeyedArray), (:KeyedArray, :NdaKa), (:NdaKa, :KeyedArray) ]
        @eval function Base.$fun(A::$T, B::$S; kw...)
            # Ideally you would pass isapprox(, atol) into unifiable_keys?
            unifiable_keys(axiskeys(A), axiskeys(B)) || return false
            return $fun(keyless(A), keyless(B); kw...)
        end
    end
end

There should perhaps be some early exit condition in that case.
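
A guard of that kind could look roughly like this. `keys_match` is a hypothetical stand-in written against plain tuples of key vectors, not the package's `unifiable_keys`, which may apply more lenient rules.

```julia
# Hypothetical stand-in for the key check: compare two tuples of key vectors.
# The length comparison returns early, before any element-wise work happens.
function keys_match(left::Tuple, right::Tuple)
    length(left) == length(right) || return false
    return all(map(==, left, right))
end

a_keys = (["a", "b", "c"],)       # keys of a 1-dimensional array
b_keys = (["a", "b"], [1, 2])     # keys of a 2-dimensional array

keys_match(a_keys, b_keys)
```

With the length check in place, the mismatched tuples are never passed to `map`, which is where the BoundsError in the MWE below comes from.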

MWE

julia> a = KeyedArray(ones(3); x=["a", "b", "c"])
1-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   x ∈ 3-element Vector{String}
And data, 3-element Vector{Float64}:
 ("a")  1.0
 ("b")  1.0
 ("c")  1.0

julia> b = KeyedArray(ones(2, 2); x=["a", "b"], y=[1, 2])
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   x ∈ 2-element Vector{String}
→   y ∈ 2-element Vector{Int64}
And data, 2×2 Matrix{Float64}:
         (1)    (2)
  ("a")    1.0    1.0
  ("b")    1.0    1.0

julia> c = KeyedArray(ones(3, 2); x=["a", "b", "c"], y=[1, 2])
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   x ∈ 3-element Vector{String}
→   y ∈ 2-element Vector{Int64}
And data, 3×2 Matrix{Float64}:
         (1)    (2)
  ("a")    1.0    1.0
  ("b")    1.0    1.0
  ("c")    1.0    1.0

julia> b == c
false

julia> a == b
ERROR: BoundsError: attempt to access Tuple{} at index [1]
Stacktrace:
 [1] getindex(t::Tuple, i::Int64)
   @ Base ./tuple.jl:29
 [2] map (repeats 2 times)
   @ ./tuple.jl:236 [inlined]
 [3] unifiable_keys(left::Tuple{Vector{String}}, right::Tuple{Vector{String}, Vector{Int64}})
   @ AxisKeys ~/.julia/packages/AxisKeys/G3Okw/src/broadcast.jl:77
 [4] ==(A::KeyedArray{Float64, 1, NamedDimsArray{(:x,), Float64, 1, Vector{Float64}}, Base.RefValue{Vector{String}}}, B::KeyedArray{Float64, 2, NamedDimsArray{(:x, :y), Float64, 2, Matrix{Float64}}, Tuple{Vector{String}, Vector{Int64}}}; kw::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ AxisKeys ~/.julia/packages/AxisKeys/G3Okw/src/functions.jl:282
 [5] ==(A::KeyedArray{Float64, 1, NamedDimsArray{(:x,), Float64, 1, Vector{Float64}}, Base.RefValue{Vector{String}}}, B::KeyedArray{Float64, 2, NamedDimsArray{(:x, :y), Float64, 2, Matrix{Float64}}, Tuple{Vector{String}, Vector{Int64}}})
   @ AxisKeys ~/.julia/packages/AxisKeys/G3Okw/src/functions.jl:282
 [6] top-level scope
   @ REPL[216]:1

julia> a == c
ERROR: BoundsError: attempt to access Tuple{} at index [1]
Stacktrace:
 [1] getindex(t::Tuple, i::Int64)
   @ Base ./tuple.jl:29
 [2] map (repeats 2 times)
   @ ./tuple.jl:236 [inlined]
 [3] unifiable_keys(left::Tuple{Vector{String}}, right::Tuple{Vector{String}, Vector{Int64}})
   @ AxisKeys ~/.julia/packages/AxisKeys/G3Okw/src/broadcast.jl:77
 [4] ==(A::KeyedArray{Float64, 1, NamedDimsArray{(:x,), Float64, 1, Vector{Float64}}, Base.RefValue{Vector{String}}}, B::KeyedArray{Float64, 2, NamedDimsArray{(:x, :y), Float64, 2, Matrix{Float64}}, Tuple{Vector{String}, Vector{Int64}}}; kw::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ AxisKeys ~/.julia/packages/AxisKeys/G3Okw/src/functions.jl:282
 [5] ==(A::KeyedArray{Float64, 1, NamedDimsArray{(:x,), Float64, 1, Vector{Float64}}, Base.RefValue{Vector{String}}}, B::KeyedArray{Float64, 2, NamedDimsArray{(:x, :y), Float64, 2, Matrix{Float64}}, Tuple{Vector{String}, Vector{Int64}}})
   @ AxisKeys ~/.julia/packages/AxisKeys/G3Okw/src/functions.jl:282
 [6] top-level scope
   @ REPL[217]:1

issues with `show` on Julia 1.7-beta3 and AxisKeys v0.1.18

julia> KeyedArray(rand(10,10); channel = [ "a$i" for i = 1:10], time=1:10)
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   channel ∈ 10-element Vector{String}
→   time ∈ 10-element UnitRange{Int64}
And data, 10×10 Matrix{Float64}:
┌ Warning: use values(kwargs) and keys(kwargs) instead of kwargs.data and kwargs.itr
│   caller = ShowWith at show.jl:139 [inlined]
└ @ Core ~/.julia/packages/AxisKeys/lOlZC/src/show.jl:139
┌ Warning: use values(kwargs) and keys(kwargs) instead of kwargs.data and kwargs.itr
│   caller = ShowWith at show.jl:139 [inlined]
└ @ Core ~/.julia/packages/AxisKeys/lOlZC/src/show.jl:139
┌ Warning: use values(kwargs) and keys(kwargs) instead of kwargs.data and kwargs.itr
│   caller = ShowWith at show.jl:139 [inlined]
└ @ Core ~/.julia/packages/AxisKeys/lOlZC/src/show.jl:139
┌ Warning: use values(kwargs) and keys(kwargs) instead of kwargs.data and kwargs.itr
│   caller = ShowWith at show.jl:139 [inlined]
└ @ Core ~/.julia/packages/AxisKeys/lOlZC/src/show.jl:139
 Error showing value of type KeyedArray{Float64, 2, NamedDimsArray{(:channel, :time), Float64, 2, Matrix{Float64}}, Tuple{Vector{String}, UnitRange{Int64}}}:
ERROR: MethodError: no method matching ncodeunits(::AxisKeys.ShowWith{Int64, NamedTuple{(), Tuple{}}})
Closest candidates are:
  ncodeunits(::SubString) at strings/substring.jl:64
  ncodeunits(::SubstitutionString) at regex.jl:548
  ncodeunits(::Char) at char.jl:65
  ...
Stacktrace:
  [1] show(io::IOContext{IOBuffer}, mime::MIME{Symbol("text/plain")}, str::AxisKeys.ShowWith{Int64, NamedTuple{(), Tuple{}}}; limit::Nothing)
    @ Base ./strings/io.jl:209
  [2] show
    @ ./strings/io.jl:201 [inlined]
  [3] show(io::IOContext{IOBuffer}, m::String, x::AxisKeys.ShowWith{Int64, NamedTuple{(), Tuple{}}})
    @ Base.Multimedia ./multimedia.jl:111
  [4] sprint(::Function, ::String, ::Vararg{Any}; context::IOContext{Base.TTY}, sizehint::Int64)
    @ Base ./strings/io.jl:110
  [5] print_matrix_row(io::IOContext{Base.TTY}, X::AbstractVecOrMat, A::Vector{Tuple{Int64, Int64}}, i::Int64, cols::Vector{Int64}, sep::String)
    @ Base ./arrayshow.jl:106
  [6] _print_matrix(io::IOContext{Base.TTY}, X::AbstractVecOrMat, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64, rowsA::UnitRange{Int64}, colsA::UnitRange{Int64})
    @ Base ./arrayshow.jl:232
  [7] print_matrix(io::IOContext{Base.TTY}, X::Matrix{Any}, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64) (repeats 2 times)
    @ Base ./arrayshow.jl:169
  [8] keyed_print_matrix(io::IOContext{Base.TTY}, A::KeyedArray{Float64, 2, NamedDimsArray{(:channel, :time), Float64, 2, Matrix{Float64}}, Tuple{Vector{String}, UnitRange{Int64}}}, reduce_size::Bool)
    @ AxisKeys ~/.julia/packages/AxisKeys/lOlZC/src/show.jl:117
  [9] print_matrix
    @ ~/.julia/packages/AxisKeys/lOlZC/src/show.jl:73 [inlined]
 [10] print_array
    @ ./arrayshow.jl:355 [inlined]
 [11] show(io::IOContext{Base.TTY}, #unused#::MIME{Symbol("text/plain")}, X::KeyedArray{Float64, 2, NamedDimsArray{(:channel, :time), Float64, 2, Matrix{Float64}}, Tuple{Vector{String}, UnitRange{Int64}}})
    @ Base ./arrayshow.jl:396
 [12] (::REPL.var"#43#44"{REPL.REPLDisplay{REPL.LineEditREPL}, MIME{Symbol("text/plain")}, Base.RefValue{Any}})(io::Any)
    @ REPL /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:261
 [13] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:505
 [14] display(d::REPL.REPLDisplay, mime::MIME{Symbol("text/plain")}, x::Any)
    @ REPL /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:254
 [15] display(d::REPL.REPLDisplay, x::Any)
    @ REPL /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:266
 [16] display(x::Any)
    @ Base.Multimedia ./multimedia.jl:328
 [17] #invokelatest#2
    @ ./essentials.jl:716 [inlined]
 [18] invokelatest
    @ ./essentials.jl:714 [inlined]
 [19] print_response(errio::IO, response::Any, show_value::Bool, have_color::Bool, specialdisplay::Union{Nothing, AbstractDisplay})
    @ REPL /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:288
 [20] (::REPL.var"#45#46"{REPL.LineEditREPL, Pair{Any, Bool}, Bool, Bool})(io::Any)
    @ REPL /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:272
 [21] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:505
 [22] print_response(repl::REPL.AbstractREPL, response::Any, show_value::Bool, have_color::Bool)
    @ REPL /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:270
 [23] (::REPL.var"#do_respond#66"{Bool, Bool, REPL.var"#77#87"{REPL.LineEditREPL, REPL.REPLHistoryProvider}, REPL.LineEditREPL, REPL.LineEdit.Prompt})(s::REPL.LineEdit.MIState, buf::Any, ok::Bool)
    @ REPL /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:841
 [24] #invokelatest#2
    @ ./essentials.jl:716 [inlined]
 [25] invokelatest
    @ ./essentials.jl:714 [inlined]
 [26] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
    @ REPL.LineEdit /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.7/REPL/src/LineEdit.jl:2493
 [27] run_frontend(repl::REPL.LineEditREPL, backend::REPL.REPLBackendRef)
    @ REPL /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:1227
 [28] (::REPL.var"#49#54"{REPL.LineEditREPL, REPL.REPLBackendRef})()
    @ REPL ./task.jl:411

Sorting

Just a thought: for a 1D KeyedArray, sort would seem to be a valid operation. Currently the keys are stripped (they just become a list of integers), but there is no reason we couldn't permute them into the sorted order of the values.
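
Done by hand, that amounts to applying one permutation to both the values and the keys. The array below is made up, and the sketch assumes `parent(parent(A))` reaches the raw `Vector`.

```julia
using AxisKeys

A = KeyedArray([3.0, 1.0, 2.0]; id=[:a, :b, :c])

vals = parent(parent(A))    # raw Vector behind the wrappers
p = sortperm(vals)          # permutation that sorts the values

# Apply the same permutation to values and keys
sorted = KeyedArray(vals[p]; id=axiskeys(A, :id)[p])
```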

Could `setindex!` work this way?

a = KeyedArray(rand(2, 3), down=[1, 2], across=[1.0, 2.0, 3.0])
b = KeyedArray(rand(1, 3), down=[1], across=[1.0, 2.0, 3.0])

julia> a[down=1] += b[down=1]
ERROR: MethodError: no method matching setindex!(::KeyedArray{Float64, 2, NamedDimsArray{(:down, :across), Float64, 2, Matrix{Float64}}, Tuple{Vector{Int64}, Vector{Float64}}}, ::KeyedArray{Float64, 1, NamedDimsArray{(:across,), Float64, 1, Vector{Float64}}, Base.RefValue{Vector{Float64}}}; down=1)
Closest candidates are:
  setindex!(::KeyedArray, ::Any, ::Any...) at /Users/mzgubic/.julia/packages/AxisKeys/G3Okw/src/struct.jl:129 got unsupported keyword argument "down"
  setindex!(::AbstractArray, ::Any, ::Any...) at abstractarray.jl:1264 got unsupported keyword argument "down"
Stacktrace:
 [1] top-level scope
   @ REPL[23]:1

Support for GPU Arrays?

Is it possible to use AxisKeys.jl together with GPU arrays, e.g. those in KernelAbstractions.jl or CUDA.jl?

Add an `align` function?

I'll open this as an issue first in case it exists and I couldn't find it. I'm looking for a function like xarray's align to be able to align the keys of multiple KeyedArrays. If this doesn't exist in some form then I'm happy to work on an implementation.

http://xarray.pydata.org/en/stable/generated/xarray.align.html?highlight=align#xarray-align
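
For a single shared dimension, an inner-join style alignment can be sketched with existing pieces: intersect the keys, then look both arrays up by the common keys. The arrays are made up, and this assumes that lookup with a vector of keys selects the matching entries, as in the `A(channel=[:MEG1, :MEG2])` example elsewhere on this page.

```julia
using AxisKeys

a = KeyedArray([1, 2, 3]; x=[:a, :b, :c])
b = KeyedArray([10, 20, 30]; x=[:b, :c, :d])

# Keys present in both arrays, in the order they appear in a
common = intersect(axiskeys(a, :x), axiskeys(b, :x))

# Looking up a vector of keys selects the matching entries of each array
a2 = a(x=common)
b2 = b(x=common)
```

A full `align` would also need outer joins with fill values and several dimensions at once, which this does not attempt.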

Poor performance of `getkey` compared to docs benchmarks

I have rerun the first set of benchmarks from /docs/speed.jl comparing getindex to getkey. In many instances getkey performs substantially worse (by more than an order of magnitude) than the getindex calls on my machine, and worse than the values in the speed.jl comments. Some calls also allocate where the results in the comments show no allocations.

I have tested this on both 1.4.2 and 1.6.1. 1.6.1 gave me ~2x speedup over the 1.4.2 results, but the numbers were still far off of what's present in the comments for getkey. The environment I ran this in was just the AxisKeys project environment with the latest version of all dependencies + BenchmarkTools. Machine information is included at the end.

Here are my results of running the benchmarks. I included the original values, and then appended my results from @btime afterwards in parentheses.

using AxisKeys, BenchmarkTools # Julia 1.4.2

#==============================#
#===== getkey vs getindex =====#
#==============================#

mat = wrapdims(rand(3,4), 11:13, 21:24)
bothmat = wrapdims(mat.data, x=11:13, y=21:24)
bothmat2 = wrapdims(mat.data, x=collect(11:13), y=collect(21:24))

@btime $mat[3, 4]    # 1.699 ns (1.699 ns)
@btime $mat(13, 24)  # 5.312 ns (124.232 ns (1 allocation: 32 bytes))

@btime $bothmat[3,4]        # 1.700 ns (1.649 ns)
@btime $bothmat[x=3, y=4]   # 1.701 ns (1.965 ns)
@btime $bothmat(13, 24)     # 5.874 ns  (124.299 ns (1 allocation: 32 bytes))
@btime $bothmat(x=13, y=24) # 14.063 ns (132.624 ns (1 allocation: 32 bytes))
@btime $bothmat2(13, 24)    # 16.719 ns (13.606 ns)

ind_collect(A) = [@inbounds(A[ijk...]) for ijk in Iterators.ProductIterator(axes(A))]
key_collect(A) = [@inbounds(A(vals...)) for vals in Iterators.ProductIterator(axiskeys(A))]

bigmat = wrapdims(rand(100,100), 1:100, 1:100);
bigmat2 = wrapdims(rand(100,100), collect(1:100), collect(1:100));

@btime ind_collect($(bigmat.data)); #  9.117 μs (4 allocations: 78.25 KiB) (12.646 μs (4 allocations: 78.25 KiB))
@btime ind_collect($bigmat);        # 11.530 μs (4 allocations: 78.25 KiB) (9.996 μs (4 allocations: 78.25 KiB))
@btime key_collect($bigmat);        # 64.064 μs (4 allocations: 78.27 KiB) (1.248 ms (10004 allocations: 390.77 KiB))
@btime key_collect($bigmat2);      # 718.804 μs (5 allocations: 78.27 KiB) (625.003 μs (5 allocations: 78.27 KiB))

twomat = wrapdims(mat.data, x=[:a, :b, :c], y=21:24)
@btime $twomat(x=:a, y=24)  # 36.734 ns (2 allocations: 64 bytes) (137.505 ns (2 allocations: 64 bytes))

@btime $twomat(24.0)        # 26.686 ns (4 allocations: 112 bytes) (44.374 ns (5 allocations: 144 bytes))
@btime $twomat(y=24.0)      # 33.860 ns (4 allocations: 112 bytes) (49.411 ns (6 allocations: 160 bytes))
@btime view($twomat, :,3)   # 24.951 ns (4 allocations: 112 bytes) (22.302 ns (4 allocations: 112 bytes))

Here's the version of everything from my instantiation:

(AxisKeys) pkg> st                                                                                                                     
Project AxisKeys v0.1.16
Status `~/Documents/AxisKeys.jl/Project.toml`
  [621f4979] AbstractFFTs v1.0.1
  [587fd27a] CovarianceEstimation v0.2.6
  [8197267c] IntervalSets v0.5.3
  [41ab1584] InvertedIndices v1.0.0
  [1fad7336] LazyStack v0.0.7
  [356022a1] NamedDims v0.2.29
  [6fe1bfb0] OffsetArrays v1.10.0
  [2913bbd2] StatsBase v0.33.8
  [bd369af6] Tables v1.4.3
  [37e2e46d] LinearAlgebra 
  [10745b16] Statistics

I'm on a Thinkpad X1 Carbon 7th gen running Ubuntu 20.04. Here's the CPU/RAM info from lshw:

H/W path             Device          Class          Description
===============================================================
                                     system         20QDS3DQ00 (LENOVO_MT_20QD_BU_Think_FM_ThinkPad X1 Carbon
/0                                   bus            20QDS3DQ00
/0/2                                 memory         16GiB System Memory
/0/2/0                               memory         8GiB Row of chips LPDDR3 Synchronous 2133 MHz (0.5 ns)
/0/2/1                               memory         8GiB Row of chips LPDDR3 Synchronous 2133 MHz (0.5 ns)
/0/c                                 memory         256KiB L1 cache
/0/d                                 memory         1MiB L2 cache
/0/e                                 memory         8MiB L3 cache
/0/f                                 processor      Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz
/0/11                                memory         128KiB BIOS
/0/100                               bridge         Coffee Lake HOST and DRAM Controller
/0/100/2                             display        UHD Graphics 620 (Whiskey Lake)
/0/100/4                             generic        Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal
/0/100/8                             generic        Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Pr

reduce(cat, xs) drops keys; `cat(xs...)` is all good though

julia> xs = [KeyedArray(i.*[1,2,3], foo=[:a,:b,:c]) for i in 1:4];

julia> hcat(xs...)
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   foo ∈ 3-element Vector{Symbol}
→   _ ∈ 4-element OneTo{Int}
And data, 3×4 Matrix{Int64}:
        (1)  (2)  (3)  (4)
  (:a)    1    2    3    4
  (:b)    2    4    6    8
  (:c)    3    6    9   12

julia> reduce(hcat, xs)
3×4 Matrix{Int64}:
 1  2  3   4
 2  4  6   8
 3  6  9  12

I suspect there is also a NamedDimsArray bug, since it also drops names.

With vcat, reduce doesn't drop names, just keys.
It is also interesting to note that vcat(ys...) allows duplicate keys, which is probably not a feature.

julia> ys = [KeyedArray(i.*[1,2,3]', bar=[:x], foo=[:a,:b,:c]) for i in 1:4];

julia> vcat(ys...)
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   bar ∈ 4-element Vector{Symbol}
→   foo ∈ 3-element Vector{Symbol}
And data, 4×3 Matrix{Int64}:
        (:a)  (:b)  (:c)
  (:x)   1     2     3
  (:x)   2     4     6
  (:x)   3     6     9
  (:x)   4     8    12

julia> reduce(vcat, ys)
4×3 NamedDimsArray(::Matrix{Int64}, (:bar, :foo)):
       → foo
↓ bar  1  2   3
       2  4   6
       3  6   9
       4  8  12

stackoverflow in zero-dimensional indexing

Zero-dimensional KeyedArray works mostly fine:

julia> using AxisKeys
julia> ka0 = KeyedArray(fill(123));
julia> ka0[1]
123
julia> size(ka0)
()
julia> ndims(ka0)
0
julia> axiskeys(ka0)
()

However, the corresponding zero-dimensional indexing fails with a StackOverflowError:

julia> ka0[]
ERROR: StackOverflowError:
Stacktrace:
     [1] getindex(A::KeyedArray{Int64, 0, NamedDimsArray{(), Int64, 0, Array{Int64, 0}}, Tuple{}})
       @ AxisKeys ~/.julia/dev/AxisKeys/src/names.jl:83
     [2] getindex(A::KeyedArray{Int64, 0, NamedDimsArray{(), Int64, 0, Array{Int64, 0}}, Tuple{}}; kw::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
       @ AxisKeys ~/.julia/dev/AxisKeys/src/names.jl:85
--- the last 2 lines are repeated 39990 more times ---
 [79983] getindex(A::KeyedArray{Int64, 0, NamedDimsArray{(), Int64, 0, Array{Int64, 0}}, Tuple{}})
       @ AxisKeys ~/.julia/dev/AxisKeys/src/names.jl:83

I'm not familiar with these internals, but it looks like the wrong getindex method gets called somehow.

For comparison, 0-d arrays in base julia:

julia> a0 = fill(123)
0-dimensional Array{Int64, 0}:
123
julia> a0[]
123
julia> a0[1]
123

Show list of names in error message

It would be helpful to show the list of names in the following error message:

  Got exception outside of a @test
  some keywords not in list of names!
  Stacktrace:
   [1] error(::String) at ./error.jl:33
   [2] #getkey#30 at /Users/x/.julia/packages/AxisKeys/YGuuO/src/names.jl:100 [inlined]
   [3] #_#28 at /Users/x/.julia/packages/AxisKeys/YGuuO/src/names.jl:95 [inlined]
   [4] macro expansion at /Users/x/.julia/packages/MutableArithmetics/bPWR4/src/rewrite.jl:276 [inlined]
   [5] macro expansion at /Users/x/.julia/packages/JuMP/y5vgk/src/macros.jl:447 [inlined]
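
A message along these lines would already help. The function `check_keywords` is hypothetical and written against plain tuples of Symbols. It is not the code in names.jl, only a sketch of what the error could report.

```julia
# Hypothetical check: names are the array's dimension names,
# kw_names are the keywords the caller supplied
function check_keywords(names::Tuple, kw_names::Tuple)
    unknown = [k for k in kw_names if k ∉ names]
    isempty(unknown) && return nothing
    error("some keywords not in list of names! Got $(unknown), but the names are $(names)")
end

check_keywords((:x, :y), (:x,))    # passes silently
```

Calling it with an unknown keyword, for example `check_keywords((:x, :y), (:z,))`, raises an error that names both the offending keyword and the valid names.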

Tag 0.1.5

I need the keys fix from 0.1.5 for invenia/Impute.jl#63

Should subtypes of `Factorization` maintain keys?

I'm wondering if we should try to maintain the keys on the components of Factorization types like SVD and Cholesky? I realize it wouldn't apply to all components, but for upper and lower matrices it might be reasonable to avoid prematurely dropping keys?

Keep lazy loading of data in mind?

Hi, just a short notice & a thing one might keep in mind: In case this package is used to wrap some sort of disk-backed array which supports lazy loading, the show function loads eagerly from disk which can be a problem for big datasets. See https://github.com/rafaqz/GeoData.jl/issues/89

add Tables.table(::KeyedArray) method

Currently a method is defined only for the type. Calling it on a value produces a wrong result because it falls back to the default method in Tables.jl for matrices.

Really long display when lookup produces views

julia> using Dates, AxisKeys

julia> arr = KeyedArray(rand(20,1000), channel=1:20, time=Second.(1:1000))
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   channel ∈ 20-element UnitRange{Int64}
→   time ∈ 1000-element Vector{Second}
And data, 20×1000 Matrix{Float64}:
        Second(1)    Second(2)    Second(3)    Second(4)    Second(5)   …   Second(996)    Second(997)    Second(998)    Second(999)    Second(1000) 
  (1)   0.174294     0.388382     0.289075     0.350618     0.46601         0.417912       0.0243023      0.0618601      0.93429        0.923933
  (2)   0.994071     0.596008     0.498769     0.0596791    0.498672        0.378613       0.824311       0.304804       0.807597       0.772333
  (3)   0.636448     0.0680218    0.378777     0.109549     0.705696        0.562759       0.603039       0.797527       0.969418       0.783946
  (4)   0.327567     0.403154     0.831636     0.990698     0.709309        0.837483       0.54263        0.322894       0.38148        0.753664
  (5)   0.171328     0.919451     0.695422     0.306574     0.686058    …   0.153861       0.218536       0.581365       0.200831       0.115166
    ⋮                                                       ⋮           ⋱                                                               ⋮
 (15)   0.110868     0.571425     0.652506     0.191932     0.801828        0.352494       0.473893       0.792803       0.288452       0.222457
 (16)   0.487455     0.738222     0.845043     0.623517     0.612191        0.219844       0.577306       0.737975       0.576429       0.940439
 (17)   0.185447     0.447439     0.871597     0.533874     0.105965        0.349765       0.686086       0.0721533      0.903647       0.658724
 (18)   0.314007     0.114403     0.559865     0.168543     0.893155    …   0.942225       0.978142       0.400859       0.139279       0.688163
 (19)   0.055023     0.857879     0.238967     0.703954     0.736718        0.117123       0.425706       0.00558212     0.348404       0.318862
 (20)   0.0554815    0.781287     0.0346575    0.0494349    0.353736        0.035409       0.0915497      0.232446       0.985577       0.267258

julia> arr(time=Interval(Second(10), Second(100)))
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   channel ∈ 20-element UnitRange{Int64}
→   time ∈ 91-element view(::Vector{Second},...)
And data, 20×91 view(::Matrix{Float64}, :, [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]) with eltype Float64:
        Second(10)    Second(11)    Second(12)    Second(13)    Second(14)   …   Second(96)    Second(97)    Second(98)    Second(99)    Second(100) 
  (1)   0.401911      0.976594      0.887549      0.231247      0.85271          0.724854      0.298616      0.514872      0.305565      0.0138886
  (2)   0.246357      0.260172      0.948085      0.317089      0.324393         0.728641      0.731313      0.309075      0.932554      0.470702
  (3)   0.0372831     0.888633      0.162363      0.198037      0.0639279        0.686144      0.756563      0.190438      0.489748      0.467288
  (4)   0.394226      0.903272      0.563503      0.872159      0.119905         0.0735394     0.440849      0.867434      0.684596      0.178688
  (5)   0.765961      0.177865      0.953313      0.207974      0.00197922   …   0.00815044    0.419335      0.318704      0.714344      0.234098
    ⋮                                                           ⋮            ⋱                                             ⋮            
 (15)   0.965539      0.100079      0.302711      0.0708196     0.98928          0.670222      0.426443      0.254453      0.725294      0.424728
 (16)   0.514395      0.337786      0.582278      0.452007      0.833571         0.776734      0.343318      0.549231      0.911476      0.372341
 (17)   0.265192      0.21141       0.996767      0.130316      0.828896         0.910629      0.558763      0.735394      0.128649      0.612153
 (18)   0.210489      0.913829      0.477584      0.815747      0.996621     …   0.927536      0.054949      0.24776       0.934392      0.476383
 (19)   0.192422      0.0438443     0.461826      0.859082      0.361938         0.800272      0.62898       0.953353      0.0982186     0.0356979
 (20)   0.0860791     0.0983406     0.115338      0.720953      0.610357         0.337986      0.121038      0.0860391     0.625933      0.794514

Here you can see the whole vector of indices gets printed. As you can imagine, this can get really long!

sortkeys() changes key container type

I think it is reasonable to expect that sorting keys should not change the key container type. This is not currently the case:

using AxisKeys, UniqueVectors

a = wrapdims(rand(2), UniqueVector, x=1:2)

println(typeof(axiskeys(a,:x)))

a = sortkeys(a)

println(typeof(axiskeys(a,:x)))

Produces:

UniqueVector{Int64}
Array{Int64,1}

Make AxisKeys play nicely with AcceleratedArrays

Due to how lookups are handled in AK, it seems like in many cases the acceleration of AcceleratedArrays isn't used. For example:

using Dates
using AcceleratedArrays
using AxisKeys

dates = accelerate(Date(2018, 1, 1):Day(1):Date(2019, 2, 3), UniqueSortIndex)

sd = Date(2018, 3, 3)
ed = Date(2018, 8, 4)

i = findall(in(AcceleratedArrays.Interval(sd, ed)), dates) 
println(i)
@assert isa(i, StepRange) # nice, just 2 searchsorteds happened

A = KeyedArray(rand(length(dates)); date = dates)

# Not happy: AxisKeys is blind to AcceleratedArrays.Interval, crashes
# A(AcceleratedArrays.Interval(sd, ed), :)

# Works, but inefficient - the view created is not a range, but a Vector of Ints, since it just does Interval inclusion in a loop:
# https://github.com/mcabbott/AxisKeys.jl/blob/master/src/selectors.jl#L6
A(AxisKeys.Interval(sd, ed), :)

Could the lookup logic in AK be abstracted a bit, to allow other libraries to implement efficient axis lookup for their types? I think the findall function is probably the correct interface, for example: https://github.com/kcajf/AcceleratedArrays.jl/blob/master/src/UniqueSortIndex.jl#L135. The currently implemented lookup logic could be reimplemented in terms of findall too, as a fallback.

mapreduce is ignoring kwargs

So far I have only seen this with JuMP

setup:

using JuMP, AxisKeys

m = Model()
@variable(m, a <= 10)
@variable(m, b <= 10)

k = KeyedArray([(a+1) (b+2); (a+b) (a+2b)]; foo=[:f1, :f2], bar=[:g1, :g2])

demonstration of mutation of contents with a KeyedArray

julia> k = KeyedArray([(a+1) (b+2); (a+b) (a+2b)]; foo=[:f1, :f2], bar=[:g1, :g2])
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   foo ∈ 2-element Vector{Symbol}
→   bar ∈ 2-element Vector{Symbol}
And data, 2×2 Array{GenericAffExpr{Float64,VariableRef},2}:
         (:g1)    (:g2)
  (:f1)    a + 1    b + 2
  (:f2)    a + b    a + 2 b

julia> sum(k)
3 a + 4 b + 3

julia> k
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   foo ∈ 2-element Vector{Symbol}
→   bar ∈ 2-element Vector{Symbol}
And data, 2×2 Array{GenericAffExpr{Float64,VariableRef},2}:
         (:g1)            (:g2)
  (:f1)    3 a + 4 b + 3    b + 2
  (:f2)    a + b            a + 2 b

demonstration of this not happening when working with backing data

julia> k = KeyedArray([(a+1) (b+2); (a+b) (a+2b)]; foo=[:f1, :f2], bar=[:g1, :g2])
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   foo ∈ 2-element Vector{Symbol}
→   bar ∈ 2-element Vector{Symbol}
And data, 2×2 Array{GenericAffExpr{Float64,VariableRef},2}:
         (:g1)    (:g2)
  (:f1)    a + 1    b + 2
  (:f2)    a + b    a + 2 b

julia> d = getfield(k, :data)
2×2 NamedDimsArray(::Array{GenericAffExpr{Float64,VariableRef},2}, (:foo, :bar)):
       → bar
↓ foo  a + 1  b + 2
       a + b  a + 2 b

julia> sum(d)
3 a + 4 b + 3

julia> k
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   foo ∈ 2-element Vector{Symbol}
→   bar ∈ 2-element Vector{Symbol}
And data, 2×2 Array{GenericAffExpr{Float64,VariableRef},2}:
         (:g1)    (:g2)
  (:f1)    a + 1    b + 2
  (:f2)    a + b    a + 2 b

view doesn't support indexing by key

For instance, we can view a KeyedArray by passing in the appropriate indices, but if we wanted to use the corresponding keys instead that isn't possible. Of course, calling KA(:a) is quite convenient but Base.view is often called in packages that expect an AbstractArray, so it would be nice if that was compatible with passing in keys instead of indices.

julia> KA = KeyedArray(rand(3, 5), features=[:a, :b, :c], time=1:5);

julia> view(KA, 1, :)
1-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   time ∈ 5-element UnitRange{Int64}
And data, 5-element view(::Matrix{Float64}, 1, :) with eltype Float64:
 (1)  0.4641547335476697
 (2)  0.739015103011146
 (3)  0.34302619878129414
 (4)  0.9326806444048761
 (5)  0.645438413359315

julia> view(KA, :a, :)
ERROR: ArgumentError: invalid index: :a of type Symbol
Stacktrace:
 [1] to_index(i::Symbol)
   @ Base ./indices.jl:300
 [2] to_index(A::KeyedArray{Float64, 2, NamedDimsArray{(:features, :time), Float64, 2, Matrix{Float64}}, Tuple{Vector{Symbol}, UnitRange{Int64}}}, i::Symbol)
   @ Base ./indices.jl:277
 [3] to_indices
   @ ./indices.jl:333 [inlined]
 [4] to_indices
   @ ./indices.jl:324 [inlined]
 [5] view(::KeyedArray{Float64, 2, NamedDimsArray{(:features, :time), Float64, 2, Matrix{Float64}}, Tuple{Vector{Symbol}, UnitRange{Int64}}}, ::Symbol, ::Function)
   @ AxisKeys ~/.julia/packages/AxisKeys/VS8DQ/src/struct.jl:75
 [6] top-level scope
   @ REPL[73]:1

show broken on julia 1.7

If I create the example KeyedArray from the readme in julia 1.7, I get an error on displaying it

using AxisKeys
data = rand(Int8, 2,10,3) .|> abs;
A = KeyedArray(data; channel=[:left, :right], time=range(13, step=2.5, length=10), iter=31:33)

3-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   channel ∈ 2-element Vector{Symbol}
→   time ∈ 10-element StepRangeLen{Float64,...}
□   iter ∈ 3-element UnitRange{Int64}
And data, 2×10×3 Array{Int8, 3}:
[:, :, 1] ~ (:, :, 31):
 ERROR: MethodError: no method matching ncodeunits(::AxisKeys.ShowWith{Int64, NamedTuple{(), Tuple{}}})
Closest candidates are:
  ncodeunits(::SubString) at strings/substring.jl:64
  ncodeunits(::SubstitutionString) at regex.jl:548
  ncodeunits(::Char) at char.jl:65
  ...
Stacktrace:
  [1] show(io::IOContext{IOBuffer}, mime::MIME{Symbol("text/plain")}, str::AxisKeys.ShowWith{Int64, NamedTuple{(), Tuple{}}}; limit::Nothing)
    @ Base .\strings\io.jl:209
  [2] show
    @ .\strings\io.jl:201 [inlined]
  [3] show(io::IOContext{IOBuffer}, m::String, x::AxisKeys.ShowWith{Int64, NamedTuple{(), Tuple{}}})
    @ Base.Multimedia .\multimedia.jl:111
  [4] sprint(::Function, ::String, ::Vararg{Any}; context::IOContext{Base.TTY}, sizehint::Int64)
    @ Base .\strings\io.jl:110
  [5] print_matrix_row(io::IOContext{Base.TTY}, X::AbstractVecOrMat, A::Vector{Tuple{Int64, Int64}}, i::Int64, cols::Vector{Int64}, sep::String)
    @ Base .\arrayshow.jl:106
  [6] _print_matrix(io::IOContext{Base.TTY}, X::AbstractVecOrMat, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64, rowsA::UnitRange{Int64}, colsA::UnitRange{Int64})
    @ Base .\arrayshow.jl:210
  [7] print_matrix(io::IOContext{Base.TTY}, X::Matrix{Any}, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64) (repeats 2 times)
    @ Base .\arrayshow.jl:169
  [8] keyed_print_matrix(io::IOContext{Base.TTY}, A::KeyedArray{Int8, 2, NamedDimsArray{(:channel, :time), Int8, 2, SubArray{Int8, 2, Array{Int8, 3}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}, Int64}, false}}, Tuple{SubArray{Symbol, 1, Vector{Symbol}, Tuple{Base.OneTo{Int64}}, true}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}}}, reduce_size::Bool)
    @ AxisKeys C:\Users\visser_mn\.julia\dev\AxisKeys\src\show.jl:117
  [9] keyed_print_matrix
    @ C:\Users\visser_mn\.julia\dev\AxisKeys\src\show.jl:77 [inlined]
 [10] limited_show_nd(io::IOContext{Base.TTY}, a::KeyedArray{Int8, 3, NamedDimsArray{(:channel, :time, :iter), Int8, 3, Array{Int8, 3}}, Tuple{Vector{Symbol}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}, UnitRange{Int64}}}, print_matrix::typeof(AxisKeys.keyed_print_matrix), label_slices::Bool)
    @ AxisKeys C:\Users\visser_mn\.julia\dev\AxisKeys\src\show.jl:235
 [11] show_nd
    @ C:\Users\visser_mn\.julia\dev\AxisKeys\src\show.jl:171 [inlined]

This is working fine for me on 1.6. There seem to be no tests for show however, so the tests are passing fine on 1.7.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

License

Hi, I just noticed there is no license in this repository. I guess this is an oversight? Otherwise it's under exclusive copyright by default (https://choosealicense.com/no-permission/).

I would suggest MIT like most other julia packages, and am ok with licensing my (tiny) contribution as MIT.

One easy way to do this is by filling in the repo url on the right hand side of https://choosealicense.com/licenses/mit/, under "Suggest this license".

BTW, I quickly went over a few of your other repos, and there are more with a missing license, e.g.:
https://github.com/mcabbott/Tullio.jl
https://github.com/mcabbott/LazyStack.jl
https://github.com/mcabbott/Bhaskara.jl
https://github.com/mcabbott/TransmuteDims.jl

Add an `align` function

http://xarray.pydata.org/en/stable/generated/xarray.align.html?highlight=align#xarray-align

Register

Perhaps it's time: @JuliaRegistrator register

What to call things

This package isn't registered yet, partly because what to call things is unclear. Thoughts:

Julia has adopted index::Int ∈ axes(A,1). So I think these other structures need other names.
From NamedDims.jl, each name::Symbol belongs to a dimension, d ∈ 1:ndims(A), which is fine. It uses dimnames(A) to give the list now, not Base.names(A).
A value is what A[1,2] returns, this is what setindex! calls it.
LabelledArrays.jl attaches a label::Symbol to every element of the array,
so that perhaps A.label == A[2,2].
For now I use key for the thing you lookup. But Base.keys(A) appears to be another way of saying CartesianIndices(A).

But what's a vector of keys called? For now I use ranges(A,1). But these do not have to be like Base.range's AbstractRanges. Perhaps scales(A,1)? A scale of keys? domains(A,1)? Or axiskeys(A, 1) since they are labels along each axis, not in the bulk?

The package could then be AxisScales (hard to say) or LabelledDims (how many ls again?)
or AxisKeys (short!) or ScaledDims (but not just re-scaling) or what?

consider a character other than `□` to label 3rd dimension

Every time I see □ I think "oh a unicode character that my font is missing a glyph for", even though I know at an intellectual level it's referring to the "sheets" of the array. I'm not sure there's a better option though. @ericphanson suggested that ↗ might work since it's "going off in another direction", or alternatively that we "choose a random arrow from https://www.sascha-frank.com/Arrow/latex-arrows.html every time".

Index eltypes unioned on empty `KeyedArray`

There seems to be a weird bug where the axiskey eltypes get unionized when converting a KeyedArray with an empty dimension to a table. Perhaps we should define a Tables.schema method for a KeyedArray to avoid this problem?

julia> ka = KeyedArray(rand(0, 3); x=1:0, y=[:a, :b, :c])
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   x ∈ 0-element UnitRange{Int64}
→   y ∈ 3-element Vector{Symbol}
And data, 0×3 Matrix{Float64}

julia> Tables.schema(DataFrame(ka))
Tables.Schema:
 :x      Union{Int64, Symbol}
 :y      Union{Int64, Symbol}
 :value  Float64
 
julia> ka = KeyedArray(rand(0, 3, 2); x=1:0, y=[:a, :b, :c], z=["foo", "bar"])
3-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   x ∈ 0-element UnitRange{Int64}
→   y ∈ 3-element Vector{Symbol}
□   z ∈ 2-element Vector{String}
And data, 0×3×2 Array{Float64, 3}

julia> Tables.schema(DataFrame(ka))
Tables.Schema:
 :x      Union{Int64, String, Symbol}
 :y      Union{Int64, String, Symbol}
 :z      Union{Int64, String, Symbol}
 :value  Float64
 
julia> eltype.(axiskeys(ka))
(Int64, Symbol, String)

matmul test fails on Julia 1.5

In the list in https://github.com/JuliaCI/NanosoldierReports/blob/18bb6d85b20e25f1a9f2f312963f4b8b4fe7082e/pkgeval/by_hash/a340bf1_vs_3c50b7f/report.md I noticed AxisKeys tests are failing on julia 1.5.

Not sure if it is an AxisKeys or NamedDims issue, but here is the error: https://github.com/JuliaCI/NanosoldierReports/blob/18bb6d85b20e25f1a9f2f312963f4b8b4fe7082e/pkgeval/by_hash/a340bf1_vs_3c50b7f/logs/AxisKeys/1.5.0-DEV-fc8aa8666a.log#L140

Indexing 2D with 1D index fails in weird ways

Consider:

julia> k = KeyedArray(rand(3,3); foo=string.("a", 1:3), bar=string.("b", 1:3));

julia> parent(k)[1:end]
9-element NamedDimsArray(::Vector{Float64}, (:foo,)):
↓ foo  0.859107332615958
       0.9989996793319562
       0.790285459376431
       0.16385806804700676
       0.07446542649577914
       0.3149899958661786
       0.6442683596770953
       0.29372899426989774
       0.5290965960710021

julia> parent(k)[1:2:end]
5-element NamedDimsArray(::Vector{Float64}, (:foo,)):
↓ foo  0.859107332615958
       0.790285459376431
       0.07446542649577914
       0.6442683596770953
       0.5290965960710021

All very happy running on the NamedDims array that backs the KeyedArray.

But when I try to do the same on the KeyedArray itself:

julia> k[1:end]
ERROR: MethodError: no method matching keys_getindex(::Tuple{Vector{String}}, ::Tuple{})
  ...
Stacktrace:
 [1] keys_getindex
   @ ~/.julia/packages/AxisKeys/iJ0Vf/src/struct.jl:109 [inlined]
 [2] getindex(A::KeyedArray{Float64, 2, NamedDimsArray{(:foo, :bar), Float64, 2, Matrix{Float64}}, Tuple{Vector{String}, Vector{String}}}, raw_inds::UnitRange{Int64})
   @ AxisKeys ~/.julia/packages/AxisKeys/iJ0Vf/src/struct.jl:80
 [3] top-level scope
   @ REPL[39]:1
   
julia> k[1:2:end]
ERROR: BoundsError: attempt to access 3-element Vector{String} at index [5]
Stacktrace:
 [1] getindex
   @ ./array.jl:801 [inlined]
 [2] getindex(A::Vector{String}, I::StepRange{Int64, Int64})
   @ Base ./array.jl:826
 [3] keys_getindex
   @ ~/.julia/packages/AxisKeys/iJ0Vf/src/struct.jl:107 [inlined]
 [4] getindex(A::KeyedArray{Float64, 2, NamedDimsArray{(:foo, :bar), Float64, 2, Matrix{Float64}}, Tuple{Vector{String}, Vector{String}}}, raw_inds::StepRange{Int64, Int64})
   @ AxisKeys ~/.julia/packages/AxisKeys/iJ0Vf/src/struct.jl:80
 [5] top-level scope
   @ REPL[40]:1

It does not go so well.

A special case of this failure with step ranges is k[diagind(k)] to access the diagonal
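For context, here is a minimal sketch of what linear indexing does on a plain `Matrix`, using only Base and the `LinearAlgebra` stdlib. It does not use AxisKeys at all; the point is just that the result is a flat vector whose length need not match either axis, so presumably neither axis's keys can be reused for it as-is.

```julia
using LinearAlgebra  # stdlib, provides diagind

A = collect(reshape(1:9, 3, 3))   # plain 3×3 Matrix{Int}, filled column-major

lin = A[1:2:end]                  # linear indexing: every other element, flattened
d = A[diagind(A)]                 # the diagonal, also selected via linear indices

println(length(lin))              # number of selected elements, not tied to size(A, 1)
println(d)
```

Both results are ordinary vectors, so any key-aware version would have to decide what keys (if any) such a result should carry.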

Unable to use Tuples to label a KeyedArray

Trying to construct an array where the labels are stored in a tuple throws an error:

  MethodError: no method matching axes(::Tuple{Symbol, Symbol, Symbol}, ::Int64)
  Closest candidates are:
    axes(::Tuple) at tuple.jl:28
    axes(::TArray, ::Any...) at /home/lime/.julia/packages/Libtask/RQkfZ/src/tarray.jl:198
    axes(::DataFrames.GroupKey, ::Integer) at /home/lime/.julia/packages/DataFrames/vuMM8/src/groupeddataframe/groupeddataframe.jl:517

The code that generated this error --

table = KeyedArray(data; model=model_names, statistic=(:cv_diff, :se_cv_diff, :weight))

I'm guessing this is unintended, and the method is accidentally overtyped. If it's intentional, KeyedArray should probably be typed in such a way as to avoid passing a tuple in the first place.
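Until tuples are accepted, one possible workaround (an assumption on my part, not something confirmed by the maintainers) is to collect the tuple into a vector before passing it as keys. The error above comes from calling `axes(keys, 1)` on a tuple, which a `Vector` supports. The variable names below are illustrative only.

```julia
# Labels as originally written: a tuple of Symbols
statistic_tuple = (:cv_diff, :se_cv_diff, :weight)

# collect turns the homogeneous tuple into a Vector{Symbol}
statistic_keys = collect(statistic_tuple)

# A vector supports the two-argument axes call that failed for the tuple
println(axes(statistic_keys, 1))

# The constructor call from the report would then become (not run here):
# table = KeyedArray(data; model=model_names, statistic=statistic_keys)
```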

findmin/max when stating dimensions

Is there a particular way of making this (below) work?
findmax(test_subject, dims=1)

ERROR: Base.keys(::KeyedArray) not defined, please open an issue if this happens unexpectedly.
Stacktrace:
 [1] error(::String) at .\error.jl:33
 [2] keys(::AxisKeys.KeyedArray{Float64,3,NamedDims.NamedDimsArray{(:time, :channels, :trials),Float64,3,Array{Float64,3}},Tuple{Array{Float64,1},Array{Symbol,1},UnitRange{Int64}}}) at C:\Users\asimd\.julia\packages\AxisKeys\rvne0\src\struct.jl:57
 [3] _findmax(::AxisKeys.KeyedArray{Float64,3,NamedDims.NamedDimsArray{(:time, :channels, :trials),Float64,3,Array{Float64,3}},Tuple{Array{Float64,1},Array{Symbol,1},UnitRange{Int64}}}, ::Int64) at .\reducedim.jl:827
 [4] findmax(::AxisKeys.KeyedArray{Float64,3,NamedDims.NamedDimsArray{(:time, :channels, :trials),Float64,3,Array{Float64,3}},Tuple{Array{Float64,1},Array{Symbol,1},UnitRange{Int64}}}; dims::Int64) at .\reducedim.jl:817
 [5] top-level scope at REPL[52]:1

Printing with StaticArrays

julia> using StaticArrays, AxisKeys

julia> KeyedArray(SA[1 2; 3 4; 5 6], alpha='a':'c', beta=SA[10,20])
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   alpha ∈ 3-element StepRange{Char,...}
→   beta ∈ 2-element SArray{Tuple{2},...}
And data, 3×2 SArray{Tuple{3,2},Int64,2,6} with indices SOneTo(3)×SOneTo(2):
         (10)  (20)
  ('a')     1     2
  ('b')     3     4
  ('c')     5     6

julia> axes(ans)
(SOneTo(3), SOneTo(2))

julia> KeyedArray(SA[1,2,3], alpha='a':'c');

julia> ans
1-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   alpha ∈ 3-element StepRange{Char,...}
And data, 3-element SArray{Tuple{3},Int64,1,3} with indices SOneTo(3):
Error showing value of type KeyedArray{Int64,1,NamedDimsArray{(:alpha,),Int64,1,SArray{Tuple{3},Int64,1,3}},Base.RefValue{StepRange{Char,Int64}}}:
ERROR: BoundsError: attempt to access ()
  at index [1]
Stacktrace:
     Function             Module           Signature
     ────────             ──────           ─────────
[1]  getindex             Base             (::Tuple, ::Int64)
at: /Applications/Julia-1.4.app/Contents/Resources/julia/bin/../share/julia/base/tuple.jl:24
[2]  map [i]              Base             
at: /Applications/Julia-1.4.app/Contents/Resources/julia/bin/../share/julia/base/tuple.jl:180
[3]  index_sizes [i]      StaticArrays     
at: ~/.julia/packages/StaticArrays/1g9bq/src/indexing.jl:80
[4]  getindex             StaticArrays     (::SArray{Tuple{3},Int64,1,3}, ::Colon, ::Colon)
at: ~/.julia/packages/StaticArrays/1g9bq/src/indexing.jl:219
[5]  keyed_print_matrix   AxisKeys         (::IOContext{REPL.Terminals.TTYTerminal}, ::KeyedArray{Int64,1,NamedDimsArray{(:alpha,),Int64,1,SArray{Tuple{3},Int64,1,3}},Base.RefValue{StepRange{Char,Int64}}}, ::Bool)

Add `skip_missing` keyword to `populate!`

In some cases you may want to fill a pre-allocated KeyedArray with table data and have it ignore key values that aren't in your pre-allocated array. This might be a nice way to gracefully avoid erroring here when the key indices aren't present?

Tables.columntable return value

Normally I would assume that Tables.columntable would return a NamedTuple of Tables.CopiedColumns or a NamedTuple of views (depending on your design assumptions).

Now it returns a NamedTuple of vectors which is a bit problematic, because downstream methods (in particular DataFrames.jl) do not know if upon creation of the object columns should be copied or not.

I think (but we can discuss) that it would be better to create a view, so that Tables.columntable is non-allocating. What do you think?

Unable to Index into AxisKeys from Inside Package

I consistently get the following error when I try to load a package that uses AxisKeys:

ERROR: LoadError: LoadError: syntax: ":Estimate" is not a valid function argument name around /home/lime/.julia/dev/ParetoSmooth/src/LeaveOneOut.jl:69

The relevant code snippet is here:

    tbl = KeyedArray(similar(log_likelihood, 3, 2); 
        crit=[:loo, :p_loo, :loo_ic],
        est=[:Estimate, :SE],
        )
    
    # Use Bayesian bootstrap to build confidence intervals
    tbl(:Estimate, :loo) = mean(pointwise_ev)
    tbl(:Estimate, :p_loo) = sum(pointwise_p_eff)
    tbl(:Estimate, :loo_ic) = -2 * ev_loo

    tbl(:SE, :loo) = se = sqrt(varm(pointwise_ev, ev_loo) / data_size)
    tbl(:SE, :p_loo) = sqrt(varm(pointwise_p_eff, p_eff / data_size) * data_size)
    tbl(:SE, :loo_ic) = 2 * se

Neither lookup with parentheses, nor indexing with square brackets, works. Replacing parentheses with square brackets instead results in ERROR: ArgumentError: invalid index: :Estimate of type Symbol when running the code.
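A likely explanation for the load-time syntax error is that Julia reads `name(args...) = value` as a short-form method definition, so the quoted symbols end up in the position of argument names. The sketch below only inspects how the expression is parsed, using `Meta.parse` from Base; it does not involve AxisKeys, and it does not show how assignment by key should be written.

```julia
# Parse (but do not evaluate) an assignment of the same shape as in the report
ex = Meta.parse("tbl(:Estimate, :loo) = 1.0")

# The top-level expression is an `=` whose left-hand side is a call,
# which is exactly the shape of a short-form function definition
println(ex.head)
println(ex.args[1].head)
println(ex.args[1].args[1])   # the name that would be defined as a function
```

This would also explain why the failure appears when the package is loaded rather than when the line runs: the error is raised while the file is being lowered, before any code executes.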

overload rename?

I would like to be able to rename axes sometimes, but without changing their keys.

Maybe we should overload NamedDims.rename and re-export it?

Wrappers for StatsBase

I realize this is introducing an extra dependency, but it'd be nice if you could do things like:

μ, σ = mean_and_std(A, wv, :channel; corrected=true)

Support more factorizations

Basically same as invenia/NamedDims.jl#24 and I think similar solution would work.
I want to preserve my keys on the nonlatent dimensions of factorizations.

Add wrapper for `eachslice`?

I'm not sure what the best approach for this is, but it'd be nice if you could do:

eachslice(A; dims=[:channel])

rather than

eachslice(A; dims=[dim(dimnames(A), :channel)])

Version of `reshape` that preserves keys?

I'm not sure how helpful this would be generally, but a lot of ML APIs restrict input arrays to matrices. It'd be kind of nice if we could support at least a subset of the reshape behaviour such that axis keys are merged. For example:

julia> ka = KeyedArray(rand(4, 3, 2); time=1:4, obj=[:a, :b, :c], loc=[:x, :y])
3-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   time ∈ 4-element UnitRange{Int64}
→   obj ∈ 3-element Vector{Symbol}
□   loc ∈ 2-element Vector{Symbol}
And data, 4×3×2 Array{Float64,3}:
[:, :, 1] ~ (:, :, :x):
      (:a)       (:b)        (:c)
 (1)   0.416197   0.327252    0.14608
 (2)   0.706717   0.0045184   0.055459
 (3)   0.487265   0.879403    0.121894
 (4)   0.156394   0.431853    0.0756667

[:, :, 2] ~ (:, :, :y):
      (:a)       (:b)       (:c)
 (1)   0.507      0.803645   0.411088
 (2)   0.92779    0.284998   0.418833
 (3)   0.137591   0.415834   0.194712
 (4)   0.785161   0.436941   0.996514

julia> reshape(ka, 4, :)
4×6 Array{Float64,2}:
 0.416197  0.327252   0.14608    0.507     0.803645  0.411088
 0.706717  0.0045184  0.055459   0.92779   0.284998  0.418833
 0.487265  0.879403   0.121894   0.137591  0.415834  0.194712
 0.156394  0.431853   0.0756667  0.785161  0.436941  0.996514

I feel like in these cases it would be nice if we could get something like:

julia> KeyedArray(reshape(ka, 4, :); time=1:4, obj_loc=[:a_x, :b_x, :c_x, :a_y, :b_y, :c_y])
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   time ∈ 4-element UnitRange{Int64}
→   obj_loc ∈ 6-element Vector{Symbol}
And data, 4×6 Array{Float64,2}:
      (:a_x)     (:b_x)      (:c_x)      (:a_y)     (:b_y)     (:c_y)
 (1)   0.416197   0.327252    0.14608     0.507      0.803645   0.411088
 (2)   0.706717   0.0045184   0.055459    0.92779    0.284998   0.418833
 (3)   0.487265   0.879403    0.121894    0.137591   0.415834   0.194712
 (4)   0.156394   0.431853    0.0756667   0.785161   0.436941   0.996514

Either with reshape directly or at least with a separate, more restrictive, function call. If this seems like it could be useful for other folks I'm happy to open a PR with a suggested function name?
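As a rough sketch of the key-merging step only, the merged keys could be built in the same column-major order that `reshape` uses to flatten the trailing dimensions. The `_` separator and the variable names are assumptions for illustration; nothing here is an existing AxisKeys function.

```julia
obj = [:a, :b, :c]
loc = [:x, :y]

# The two-variable comprehension yields a 3×2 matrix with obj varying fastest;
# vec flattens it column-major, matching how reshape(ka, 4, :) merges obj and loc
merged = vec([Symbol(o, :_, l) for o in obj, l in loc])

println(merged)

# The merged keys could then label the reshaped data (not run here):
# KeyedArray(reshape(ka, 4, :); time=1:4, obj_loc=merged)
```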

Support for PDMats

I would like KeyedArrays to play nice as parameters of Distributions.

This already works in most cases, but there are problems with PDMats.

There is a full interface to define for new subtypes of AbstractPDMat, but I don't know if all of that is warranted.

Constructor example

julia> using AxisKeys, Distributions, PDMats

julia> pdmat = PDMat([1.0 0.0; 0.0 1.0])
2×2 PDMat{Float64,Array{Float64,2}}:
 1.0  0.0
 0.0  1.0

julia> PDMat(KeyedArray([1.0 0.0; 0.0 1.0], ([:x1, :x2], [:y1, :y2])))
ERROR: MethodError: no method matching PDMat(::KeyedArray{Float64,2,Array{Float64,2},Tuple{Array{Symbol,1},Array{Symbol,1}}})
Closest candidates are:
  PDMat(::AbstractArray{T,2} where T, ::LinearAlgebra.Cholesky{T,S}) where {T, S} at /Users/bencottier/.julia/packages/PDMats/Rw2Hf/src/pdmat.jl:12
  PDMat(::LinearAlgebra.Symmetric) at /Users/bencottier/.julia/packages/PDMats/Rw2Hf/src/pdmat.jl:20
  PDMat(::Array{T,2} where T) at /Users/bencottier/.julia/packages/PDMats/Rw2Hf/src/pdmat.jl:19
  ...
Stacktrace:
 [1] top-level scope at REPL[10]:1

This prevents passing a KeyedArray as a covariance matrix in a Distribution.

julia> d = MvNormal(
           KeyedArray([1.0, 2.0], [:a, :b]),
           KeyedArray([1.0 0.0; 0.0 1.0], ([:a, :b], [:a, :b]))
       )
ERROR: MethodError: no method matching PDMats.PDMat(::KeyedArray{Float64,2,Array{Float64,2},Tuple{Array{Symbol,1},Array{Symbol,1}}})
Closest candidates are:
  PDMats.PDMat(::AbstractArray{T,2} where T, ::LinearAlgebra.Cholesky{T,S}) where {T, S} at /Users/bencottier/.julia/packages/PDMats/Rw2Hf/src/pdmat.jl:12
  PDMats.PDMat(::LinearAlgebra.Symmetric) at /Users/bencottier/.julia/packages/PDMats/Rw2Hf/src/pdmat.jl:20
  PDMats.PDMat(::Array{T,2} where T) at /Users/bencottier/.julia/packages/PDMats/Rw2Hf/src/pdmat.jl:19
  ...
Stacktrace:
 [1] MvNormal(::KeyedArray{Float64,1,Array{Float64,1},Base.RefValue{Array{Symbol,1}}}, ::KeyedArray{Float64,2,Array{Float64,2},Tuple{Array{Symbol,1},Array{Symbol,1}}}) at /Users/bencottier/.julia/packages/Distributions/cNe2C/src/multivariate/mvnormal.jl:211
 [2] top-level scope at REPL[92]:1

`\` operator example

Another case is the \ operator, which is used by e.g. Distributions._logpdf:

julia> pdmat = PDMat([1.0 0.0; 0.0 1.0]);

julia> pdmat \ [1.0, 2.0]
2-element Array{Float64,1}:
 1.0
 2.0

julia> pdmat \ KeyedArray([1.0, 2.0], [:x, :y])
ERROR: MethodError: \(::PDMat{Float64,Array{Float64,2}}, ::KeyedArray{Float64,1,Array{Float64,1},Base.RefValue{Array{Symbol,1}}}) is ambiguous. Candidates:
  \(a::PDMat, x::Union{AbstractArray{T,1}, AbstractArray{T,2}} where T) in PDMats at /Users/bencottier/.julia/packages/PDMats/Rw2Hf/src/pdmat.jl:50
  \(x::AbstractArray{T,2} where T, y::KeyedArray{T,1,AT,RT} where RT where AT where T<:Number) in AxisKeys at /Users/bencottier/.julia/packages/AxisKeys/1jgJz/src/functions.jl:303
Possible fix, define
  \(::PDMat, ::KeyedArray{T,1,AT,RT} where RT where AT where T<:Number)
Stacktrace:
 [1] top-level scope at REPL[12]:1

julia> d = MvNormal(KeyedArray([1.0, 2.0], [:a, :b]), [1.0 0.0; 0.0 1.0])
MvNormal{Float64,PDMats.PDMat{Float64,Array{Float64,2}},KeyedArray{Float64,1,Array{Float64,1},Base.RefValue{Array{Symbol,1}}}}(
dim: 2
μ: [1.0, 2.0]
Σ: [1.0 0.0; 0.0 1.0]
)

julia> logpdf(d, [1.0, 2.0])
ERROR: MethodError: \(::LinearAlgebra.LowerTriangular{Float64,Array{Float64,2}}, ::KeyedArray{Float64,1,Array{Float64,1},Base.RefValue{Array{Symbol,1}}}) is ambiguous. Candidates:
  \(A::Union{LinearAlgebra.LowerTriangular, LinearAlgebra.UpperTriangular}, B::AbstractArray{T,1} where T) in LinearAlgebra at /Applications/Julia-1.5.app/Contents/Resources/julia/share/julia/stdlib/v1.5/LinearAlgebra/src/triangular.jl:2050
  \(x::AbstractArray{T,2} where T, y::KeyedArray{T,1,AT,RT} where RT where AT where T<:Number) in AxisKeys at /Users/bencottier/.julia/packages/AxisKeys/1jgJz/src/functions.jl:303
Possible fix, define
  \(::Union{LinearAlgebra.LowerTriangular, LinearAlgebra.UpperTriangular}, ::KeyedArray{T,1,AT,RT} where RT where AT where T<:Number)
Stacktrace:
 [1] invquad(::PDMat{Float64,Array{Float64,2}}, ::KeyedArray{Float64,1,Array{Float64,1},Base.RefValue{Array{Symbol,1}}}) at /Users/bencottier/.julia/packages/PDMats/Rw2Hf/src/pdmat.jl:79
 [2] sqmahal(::MvNormal{Float64,PDMat{Float64,Array{Float64,2}},KeyedArray{Float64,1,Array{Float64,1},Base.RefValue{Array{Symbol,1}}}}, ::Array{Float64,1}) at /Users/bencottier/.julia/packages/Distributions/cNe2C/src/multivariate/mvnormal.jl:266
 [3] _logpdf(::MvNormal{Float64,PDMat{Float64,Array{Float64,2}},KeyedArray{Float64,1,Array{Float64,1},Base.RefValue{Array{Symbol,1}}}}, ::Array{Float64,1}) at /Users/bencottier/.julia/packages/Distributions/cNe2C/src/multivariate/mvnormal.jl:127
 [4] logpdf(::MvNormal{Float64,PDMat{Float64,Array{Float64,2}},KeyedArray{Float64,1,Array{Float64,1},Base.RefValue{Array{Symbol,1}}}}, ::Array{Float64,1}) at /Users/bencottier/.julia/packages/Distributions/cNe2C/src/multivariates.jl:201
 [5] top-level scope at REPL[18]:1

Package calling `searchsortedfirst` with non sorted ranges

In 1.6.1 the following behavior changed:

1.6.0:

julia> searchsortedfirst(1.0:-0.1:0.1, 0.3)
8

julia> searchsortedfirst(collect(1.0:-0.1:0.1), 0.3)
1

1.6.1:

julia> searchsortedfirst(1.0:-0.1:0.1, 0.3)
1

julia> searchsortedfirst(collect(1.0:-0.1:0.1), 0.3)
1

This causes the test here

AxisKeys.jl/test/_basic.jl

Line 106 in cddf60f

@test [V4[Near(x)] for x in xs] == [V5[Near(x)] for x in xs]

to fail.

The issue seems to be that the package is using searchsortedfirst but the range it passes in is not sorted:

AxisKeys.jl/src/selectors.jl

Line 67 in cddf60f

iplus = searchsortedfirst(range, sel.val)

4|debug> st
In findindex(sel, range) at /Users/kristoffercarlsson/.julia/packages/AxisKeys/G3Okw/src/selectors.jl:66
 66  function findindex(sel::Near, range::AbstractRange)
 67      iplus = searchsortedfirst(range, sel.val)
 68      # "index of the first value in a greater than or equal to x"
>69      if abs(range[iplus]-sel.val) < abs(range[iplus-1]-sel.val)
 70          return iplus
 71      else
 72          return iplus-1
 73      end
 74  end

About to run: (getindex)(1.0:-0.1:0.1, 0)
4|julia> range
1.0:-0.1:0.1

4|julia> issorted(range)
false

Perhaps a rev was forgotten to be passed to searchsortedfirst?
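To illustrate the suggestion, here is a minimal sketch of a nearest-key lookup that passes `rev=true` when the keys are descending and clamps the result at both ends. `nearest_index` is a hypothetical helper written for this example, not the package's `findindex`, and it assumes the keys are sorted in one direction or the other.

```julia
# Hypothetical helper (not part of AxisKeys): index of the key nearest to x,
# for keys sorted either ascending or descending.
function nearest_index(keys::AbstractVector{<:Real}, x::Real)
    isempty(keys) && throw(ArgumentError("keys must not be empty"))
    # Descending keys need rev=true for the binary search to be valid
    rev = length(keys) > 1 && first(keys) > last(keys)
    i = searchsortedfirst(keys, x; rev=rev)
    # Clamp when x lies beyond either end, so keys[i-1] and keys[i] both exist below
    i <= firstindex(keys) && return firstindex(keys)
    i > lastindex(keys) && return lastindex(keys)
    # Otherwise choose the closer of the two neighbouring keys
    return abs(keys[i] - x) < abs(keys[i-1] - x) ? i : i - 1
end

descending = collect(10:-1:1)
println(nearest_index(descending, 3))
println(nearest_index(descending, 100))
```

The clamping also avoids the `getindex(range, 0)` call visible in the debugger session above, which happens when `iplus` comes back as 1.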

for fun in [:(==), :isequal, :isapprox]
    for (T, S) in [(:KeyedArray, :KeyedArray), (:KeyedArray, :NdaKa), (:NdaKa, :KeyedArray)]
        @eval function Base.$fun(A::$T, B::$S; kw...)
            # Ideally you would pass isapprox(, atol) into unifiable_keys?
            unifiable_keys(axiskeys(A), axiskeys(B)) || return false
            return $fun(keyless(A), keyless(B); kw...)
        end
    end
end
