juliaarrays / axisarrays.jl Goto Github PK
View Code? Open in Web Editor NEWPerformant arrays where each dimension can have a named axis with values
Home Page: http://JuliaArrays.github.io/AxisArrays.jl/latest/
License: Other
Performant arrays where each dimension can have a named axis with values
Home Page: http://JuliaArrays.github.io/AxisArrays.jl/latest/
License: Other
Documenter needs this to be able to push docs to GitHub Pages. Only a repo owner can do this I think.
We're getting to the point where the core API is stabilizing. We can now start making things fancier and building out interfaces to base methods and other packages (like #5). Here's a current brain-dump of some of my thoughts. Additions, critiques and comments are very welcome.
setindex!
(mirroring getindex
's capabilities). (#11)Axis
types to specify the dimension names. It would be an interesting experiment to store the dimension names as Axis
types instead of symbols. I'm not sure if that'd make things simpler or more convoluted. It may be a mixed bag.Date
. Are there other types that satisfy these criteria? What is a sensible name for this trait?Axis
type (#15). Ensuring that large, non-Range Dimensional axes are monotonically increasing can take a long time (we may be able to speed this up some with a special Ordering type, but it's still O(n)). Similarly, ensuring that elements in categorical vectors are unique requires hashing all elements (which could be used for the above hashmap).eachslice
iterator would be nice and useful in and of itself, and if we allow the same sort of syntax and semantics as mapslices
it can serve as the building block for augmenting that Base function.Signals.window
. I was thinking of allowing windowing as an indexing operation (mbauman/Signals.jl#10), but constructing vectors of interval types with deferred promotion (to allow, e.g., windows specified in time about integer indices) has been very challenging.sum
, mean
, maximum
, mapslices
, etc. We can also return AxisArrays with the properly reduced axis set, dropping a dimension and eliminating the type-unstable squeezes.Knowing how these work and how they're used is helpful for implementing custom index types.
I'll submit a PR today.
I'm still just reading the code (and getting a better feel for staged functions!). On @mbauman's question in the code on allowing missing names, I think we should make axes names mandatory or give defaults. However, we should allow the user to leave out the axes (default to nothing for each). A user might only want the Axis() indexing capability. If folks agree, it would be good to flip the order on the following constructor as follows:
stagedfunction AxisArray{T,N}(A::AbstractArray{T,N}, names::(Symbol...), axes::(AbstractVector...))
Then we need defaults of nothing for each axes
. Can you have a staged function with default arguments?
Is there any reason there's currently no constructor to create an uninitialized AxisArray? I.e. something like:
AxisArray{T,N,D}(axes::Axis...)
This seems pretty straightforward, I'm happy to submit a PR - just wanted to check that it wasn't intentionally omitted for some reason.
There's a potential conflict between N
and length(axes)
- presumably N
would be the canonical dimensionality with axes
either truncated or extended (with the standard default dimension names and zero-length?) as needed.
Edit: To avoid the over/under-specification, an alternative would be to deviate from the AxisArray type parameter order specification and use AxisArray{T,D}
or just AxisArray{T}
(where T is a concrete child of AbstractArray). I assume that's frowned upon though - if it's even possible (I'm not clear on how exactly type parametrization works on constructor methods or methods in general, really - any recommended resources for that?).
Hello,
I'd like to know if there is a way to do basic operations on AxisArrays, such as addition of two AxisArray that would have the same axes.
Thanks for any info!
When indexing into an AxisArray
, how can I get the value of the axis at the indexed value?
This becomes relevant for instance in the last example (btw, I PR:ed a version that works after following what you concluded in #127), it would make extra sense (to me at least) when the intensity-weighted mean is performed on the actual axis values, not on the integer indices. So in this specific example the axis are just integers, but if B
was say AxisArray(randn(100,100,100), Axis{:x}(1:100), Axis{:y}(1:100), Axis{:z}(linspace(1mm, 3mm, 100)))
, how would you perform the weighted mean then?
It would be great to have AxisArray-specific concatenation operations - it would seem fairly straightforward to implement for concatenating on existing dimensions, but operating on higher (not-yet-existing) dimensions might need a bit more consideration regarding how to create the new axes. Or maybe one would just be required to supply an appropriate Axis
?
When I started development of this package, it was somewhat of a testbed for new indexing behaviors. I tried to make the package rather opinionated on how I believed indexing should work. One of those opinions was that indexing should return views by default.
Base Julia has caught up (and surpassed) AxisArrays in terms of APL indexing, but it hasn't caught the view train yet. Should AxisArrays take a step back and match behaviors?
Using AxisArrays master:
julia> x = AxisArray(Array{Int}(3), Axis{:t}(-3:-1))
1-dimensional AxisArray{Int64,1,...} with axes:
:t, -3:-1
And data, a 3-element Array{Int64,1}:
0
0
0
julia> x[-3]
ERROR: BoundsError: attempt to access 3-element Array{Int64,1} at index [-3]
Stacktrace:
[1] getindex(::AxisArrays.AxisArray{Int64,1,Array{Int64,1},Tuple{AxisArrays.Axis{:t,UnitRange{Int64}}}}, ::Int64) at /home/mlubin/.julia/v0.6/AxisArrays/src/indexing.jl:41
The corresponding definition of getindex
is:
# Simple scalar indexing where we just set or return scalars
@propagate_inbounds Base.getindex(A::AxisArray, idxs::Int...) = A.data[idxs...]
Is this a bug or a feature? Are integer-indexed axes like -3:-1
not really supported?
The tag name "0.3.0" is not of the appropriate SemVer form (vX.Y.Z).
cc: @mbauman
I've been incorporating AxisArrays into my measurement code and I have got to say, it is really awesome. Great work. One small thing that tripped me up a bit was the following:
julia> A = AxisArray(rand(10,10), Axis{:test1}(0.1:0.1:1.0), Axis{:test2}(0.1:0.1:1.0))
2-dimensional AxisArray{Float64,2,...} with axes:
:test1, 0.1:0.1:1.0
:test2, 0.1:0.1:1.0
And data, a 10×10 Array{Float64,2}:
0.956542 0.269435 0.968573 0.749255 0.931304 0.0717015 0.881473 0.396549 0.23055 0.0553351
0.645476 0.745046 0.127696 0.939494 0.624804 0.209243 0.390447 0.335097 0.0910745 0.258535
0.0884433 0.342941 0.163901 0.521877 0.876765 0.0489701 0.796682 0.365296 0.254204 0.568327
0.205803 0.540466 0.891773 0.704059 0.109656 0.91213 0.0746333 0.802257 0.220095 0.332555
0.0624195 0.0513494 0.863017 0.920406 0.443187 0.566314 0.57402 0.601021 0.0832811 0.355787
0.615837 0.780253 0.697182 0.587102 0.571548 0.698586 0.249986 0.714737 0.253774 0.876721
0.50435 0.790261 0.464834 0.90049 0.0769789 0.395214 0.262674 0.485041 0.645104 0.470171
0.976307 0.477756 0.576011 0.975836 0.664999 0.54624 0.28258 0.970559 0.654953 0.589304
0.102939 0.790315 0.141168 0.208602 0.990242 0.752532 0.0452516 0.687015 0.740944 0.4406
0.700986 0.795746 0.645764 0.979793 0.287919 0.398423 0.0822585 0.895504 0.446748 0.534195
julia> A[Axis{:test2}(0.1)]
ERROR: BoundsError: attempt to access 10×10 Array{Float64,2} at index [Colon(),0.1]
in throw_boundserror(::Array{Float64,2}, ::Tuple{Colon,Float64}) at ./abstractarray.jl:355
in checkbounds at ./abstractarray.jl:284 [inlined]
in _getindex at ./multidimensional.jl:270 [inlined]
in getindex at ./abstractarray.jl:752 [inlined]
in getindex at /Users/ajkeller/.julia/v0.5/AxisArrays/src/indexing.jl:52 [inlined]
in getindex(::AxisArrays.AxisArray{Float64,2,Array{Float64,2},Tuple{AxisArrays.Axis{:test1,FloatRange{Float64}},AxisArrays.Axis{:test2,FloatRange{Float64}}}}, ::AxisArrays.Axis{:test2,Float64}) at /Users/ajkeller/.julia/v0.5/AxisArrays/src/indexing.jl:76
I soon realized of course that dimensional axes are not allowed to be indexed by Real
elements, but it seems like the intended error message is not being delivered.
In my use case it would be really great to be able to index by a real value. (Let's say you're measuring a 2D sweep, where your axis values are probably coming from ranges with nonzero step size, and you want to take a slice at a particular value on that axis to make a plot.) Just a suggestion, but something like the following seems to work:
julia> immutable Value{T}
val::T
end
julia> atvalue(x) = Value(x)
atvalue (generic function with 1 method)
julia> AxisArrays.axisindexes(::Type{AxisArrays.Dimensional}, ax::AbstractVector, idx::Value) = findfirst(ax.==idx.val)
julia> A[Axis{:test1}(atvalue(0.2)), Axis{:test2}(atvalue(0.3))]
0.12769615238125231
As an aside, it seems like indexing-by-value just works out of the box with Unitful numbers, because Unitful.Quantity
is not a subtype of Real
. It'd be nice if there were an option to make indexing by value work with Real
numbers though.
I find the ataxis function very convenient. I'd like the function to be always safe to use:
A = AxisArray(collect(1:11), Axis{:time}(0:0.1:1));
A[-Inf .. 0.5] #works and indices are clamped correctly
A[atindex(-0.2..0.2, 5)] #works but one has to be careful not to set the index too low or high to hit the boundary
A[atindex(-0.2..0.2, 1)] #BoundsError: attempt to access 11-element Array{Int64,1} at index [[-1, 0, 1, 2, 3]]
It would be great if the indexing with atindex would clamp the indices the same way as for A[-Inf .. 0.5]
Also, i'd like to have a function that would be used like this:
A[atrange(-0.2..0.2, 0.4)] # would return the same subarray as A[atindex(-0.2..0.2, 5)]
#now I go with the "naive implementation"
srange(a, width, value; kw...) = atindex(-width..width, AxisArrays.to_index(a, atvalue(value,kw...))[1])
Unfortunately, one ends up inside @generated function to_index(A::AxisArray{T,N,D,Ax}, I...) where {T,N,D,Ax}
that is too meta for me to get a clue what it does, nor fix the issue myself.
Petr
As you mention in JuliaLang/julia#12624, several people have implemented interval types in their packages. I know at least of @dcjones's IntervalTree package. I'm also looking at intervals for a package that I'm working on. Would you be interested in splitting out the interval type into a separate package? It seems like your implementation is useful pretty generally.
Currently AxisArrays do not use the container type information from the parent when working with broadcast. Typically this is fine but in some cases this can raise an exception:
julia> using DataArrays, AxisArrays
julia> a = AxisArray(@data([1, NA, 3]), 'a':'c')
1-dimensional AxisArray{Int64,1,...} with axes:
:row, 'a':1:'c'
And data, a 3-element DataArrays.DataArray{Int64,1}:
1
NA
3
julia> parent(a) .== 1
3-element DataArrays.DataArray{Bool,1}:
true
NA
false
julia> a .== 1
ERROR: MethodError: Cannot `convert` an object of type DataArrays.NAtype to an object of type Bool
This may have arisen from a call to the constructor Bool(...),
since type constructors fall back to convert methods.
Stacktrace:
[1] setindex!(::Array{Bool,1}, ::DataArrays.NAtype, ::Int64) at ./array.jl:549
[2] macro expansion at ./broadcast.jl:180 [inlined]
[3] macro expansion at ./simdloop.jl:73 [inlined]
[4] macro expansion at ./broadcast.jl:174 [inlined]
[5] _broadcast!(::##1#2, ::BitArray{1}, ::Tuple{Tuple{Bool}}, ::Tuple{Tuple{Int64}}, ::AxisArrays.AxisArray{Int64,1,DataArrays.DataArray{Int64,1},Tuple{AxisArrays.Axis{:row,StepRange{Char,Int64}}}}, ::Tuple{}, ::Type{Val{0}}, ::CartesianRange{CartesianIndex{1}}) at ./broadcast.jl:162
[6] broadcast_t(::Function, ::Type{Bool}, ::Tuple{Base.OneTo{Int64}}, ::CartesianRange{CartesianIndex{1}}, ::AxisArrays.AxisArray{Int64,1,DataArrays.DataArray{Int64,1},Tuple{AxisArrays.Axis{:row,StepRange{Char,Int64}}}}) at ./broadcast.jl:279
[7] broadcast_c at ./broadcast.jl:314 [inlined]
[8] broadcast(::Function, ::AxisArrays.AxisArray{Int64,1,DataArrays.DataArray{Int64,1},Tuple{AxisArrays.Axis{:row,StepRange{Char,Int64}}}}) at ./broadcast.jl:434
Not sure what happened after #55 but reductions now do not work anymore.
Two examples
julia> Y = AxisArray(randn(4,4,4,4),:x,:y,:z,:time)
julia> maximum(parent(Y),2)
ERROR: BoundsError: attempt to access (:x,:y,:z,:time)
at index [5]
in getindex(::Tuple{Symbol,Symbol,Symbol,Symbol}, ::Int64) at ./tuple.jl:8
in reaxis(...) at /home/knopp/.julia/v0.5/AxisArrays/src/indexing.jl:42
in macro expansion at ./reducedim.jl:230 [inlined]
in macro expansion at ./simdloop.jl:73 [inlined]
in _mapreducedim!(::Base.#identity, ::Base.#scalarmax, ::Array{Float64,4}, ::AxisArrays.AxisArray{Float64,4,Array{Float64,4},Tuple{AxisArrays.Axis{:x,Base.OneTo{Int64}},AxisArrays.Axis{:y,Base.OneTo{Int64}},AxisArrays.Axis{:z,Base.OneTo{Int64}},AxisArrays.Axis{:time,Base.OneTo{Int64}}}}) at ./reducedim.jl:229
in mapreducedim!(::Function, ::Function, ::Array{Float64,4}, ::AxisArrays.AxisArray{Float64,4,Array{Float64,4},Tuple{AxisArrays.Axis{:x,Base.OneTo{Int64}},AxisArrays.Axis{:y,Base.OneTo{Int64}},AxisArrays.Axis{:z,Base.OneTo{Int64}},AxisArrays.Axis{:time,Base.OneTo{Int64}}}}) at ./reducedim.jl:237
in mapreducedim(::Function, ::Function, ::AxisArrays.AxisArray{Float64,4,Array{Float64,4},Tuple{AxisArrays.Axis{:x,Base.OneTo{Int64}},AxisArrays.Axis{:y,Base.OneTo{Int64}},AxisArrays.Axis{:z,Base.OneTo{Int64}},AxisArrays.Axis{:time,Base.OneTo{Int64}}}}, ::Int64) at ./reducedim.jl:268
in maximum(::AxisArrays.AxisArray{Float64,4,Array{Float64,4},Tuple{AxisArrays.Axis{:x,Base.OneTo{Int64}},AxisArrays.Axis{:y,Base.OneTo{Int64}},AxisArrays.Axis{:z,Base.OneTo{Int64}},AxisArrays.Axis{:time,Base.OneTo{Int64}}}}, ::Int64) at ./reducedim.jl:320
and
julia> Y = AxisArray(randn(4,4,4),:x,:y,:z)
julia> maximum(parent(Y),2)
ERROR: ArgumentError: ordering is not well-defined for arrays
in macro expansion at ./reducedim.jl:230 [inlined]
in macro expansion at ./simdloop.jl:73 [inlined]
in _mapreducedim!(::Base.#identity, ::Base.#scalarmax, ::Array{Float64,3}, ::AxisArrays.AxisArray{Float64,3,Array{Float64,3},Tuple{AxisArrays.Axis{:x,Base.OneTo{Int64}},AxisArrays.Axis{:y,Base.OneTo{Int64}},AxisArrays.Axis{:z,Base.OneTo{Int64}}}}) at ./reducedim.jl:229
in mapreducedim! at ./reducedim.jl:237 [inlined]
in mapreducedim at ./reducedim.jl:268 [inlined]
in maximum at ./reducedim.jl:318 [inlined]
in maximum(::AxisArrays.AxisArray{Float64,3,Array{Float64,3},Tuple{AxisArrays.Axis{:x,Base.OneTo{Int64}},AxisArrays.Axis{:y,Base.OneTo{Int64}},AxisArrays.Axis{:z,Base.OneTo{Int64}}}}, ::Int64) at ./reducedim.jl:320
Interestingly its two different errors. I don't know what the parent
is supposed to do but it was proposed in #56 by @timholy. It seems that parent
is the identity for AxisArrays
Not sure if this has been discussed before (a quick search didn't turn up much) but I'm wondering if it is possible to make AxisArray
abstract (or have an AbstractAxisArray
that AxisArray
inherits from) and defining an interface that a new AxisArray
type must implement to get the indexing behavior of an AxisArray
? I ask because for variables in NetCDF files it seems natural for them to be AxisArray
s since all arrays in NetCDF files have the dimension information included in the file. But they have other information (e.g., attributes) that it is natural for a NCVar
type to include as well. Thus one could have NCVar <: AbstractAxisArray
. This would also allow NCVar
to have the option of being lazily read when indexed instead of being completely in memory (NetCDF files can be large).
Thinking more about high dimensional concatenation and some typical use cases for combining AxisArrays, there may be a few ways to make better use of axis data for "smarter" alternatives to cat
:
merge{T<:AxisArray}(As::T...)
: given AxisArrays with identical axes but disjoint / non-colliding axis values, automatically combine the inputs into a single AxisArray spanning at least the discrete ranges of all of the input AxisArrays. Unlike a basic cat
, dimensional axes would be automatically sorted as needed. Any elements at a newly-created axis coordinate would assume some default value (in Base a comparable multidimensional cat
fills undefined elements with zeros, but maybe this value could be user-definable?).
join{T<:AxisArray}(method::Symbol, As:T...)
: given AxisArrays with identical axes and possibly overlapping axis values, automatically concatenate the inputs on a new, higher dimension/axis. Dimensional axes would be sorted appropriately. Inner/left/right/outer join methods would determine whether all elements are included in the returned AxisArray or just those at axis values found in all/first/last input AxisArrays.
Thoughts? Are these actions generally common enough to merit special functions? Like the cat
work I was going to implement something similar for a downstream package but figure it may be more generally useful?
Here is package test output from 0.7-beta (omitting lots of deprecations):
Test Summary: | Pass | Fail | Error | Total |
---|---|---|---|---|
AxisArrays | 140 | 12 | 14 | 166 |
Core | 1 | 1 | ||
Intervals | 60 | 60 | ||
Indexing | 19 | 5 | 24 | |
SortedVector | 3 | 7 | 10 | |
CategoricalVector | 4 | 4 | 3 | 11 |
Search | 26 | 26 | ||
Combine | 28 | 4 | 32 | |
README | 1 | 1 |
Also, are users now expected to module-qualify all calls of axes
(since Base
exports an incompatible function)?
(Sorry to ping you, @mbauman, but this repo doesn't seem to get much attention and the last question is important for the image-processing ecosystem.)
@mbauman have you given any thought as to how sparse arrays would fit into this framework? I have a project that I've been working on that overlaps with this concept and I would like to help move it here so that people will actually use it :-)
I'm curious why the API has this indexing syntax and why it's different from standard Julia array indexing.
julia> A = AxisArray(reshape(1:60, 12, 5), (.1:.1:1.2, .1:.1:.5));
julia> A[Axis{:col}(2)] == A[:,2]
true
julia> A[Axis{:col}(2)] === A[:,2]
false
It would be great to have the ability to do the following:
Main> a = AxisArray(randn(10,10,5), :x, :y, :time);
Main> timeaxis(a)
AxisArrays.Axis{:time,Base.OneTo{Int64}}(Base.OneTo(5))
Main> for timepoint in timeaxis(a)
println(timepoint)
end
ERROR: MethodError: no method matching start(::AxisArrays.Axis{:time,Base.OneTo{Int64}})
Closest candidates are:
start(::SimpleVector) at essentials.jl:258
start(::Base.MethodList) at reflection.jl:560
start(::ExponentialBackOff) at error.jl:107
...
Stacktrace:
[1] anonymous at ./<missing>:?
[2] eval(::Module, ::Any) at ./boot.jl:235
That would be the most intuitive method for me to iterate over slices along an axis.
My current solution is way uglier and more fragile
for timepoint in 1:size(exp_img)[axisdim(exp_img, Axis{:time}())]
println(timepoint)
end
The last example in the readme:
B = AxisArray(randn(100,100,100), :x, :y, :z)
Itotal = sumz = 0.0
for iter in eachindex(B) # traverses in storage order for cache efficiency
I = B[iter] # intensity in a single voxel
Itotal += I
sumz += I * iter[axisdim(B, Axis{:z})] # axisdim "looks up" the z dimension
end
meanz = sumz/Itotal
doesn't work under Julia 0.6. It gives a bounds error:
ERROR: BoundsError
Stacktrace:
[1] getindex(::Int64, ::Int64) at ./number.jl:38
[2] macro expansion at ./REPL[4]:4 [inlined]
[3] anonymous at ./<missing>:?
I believe this is due to eachindex
returning a single value instead of a tuple. This is likely because it is using fast linear indexing instead of Cartesian indexing. What would the updated version of this code look like? @timholy
Apologies if this is a known issue and/or if the transition to 0.7 is still incomplete, but just in case I thought I should report this.
julia> a = AxisArray(rand(3,3), [:a,:b,:c], [:x,:y,:z])
2-dimensional AxisArray{Float64,2,...} with axes:
:row, Symbol[:a, :b, :c]
:col, Symbol[:x, :y, :z]
And data, a 3×3 Array{Float64,2}:
0.991885 0.000544059 0.511753
0.242214 0.0773739 0.131205
0.42773 0.532756 0.565765
julia> test(a) = a[:a,:x]
test (generic function with 1 method)
Version 0.6.2:
julia> @btime test(a);
53.651 ns (1 allocation: 16 bytes)
Version 0.7.0-DEV.4958:
julia> @btime test(a);
571.501 μs (223 allocations: 25.45 KiB)
In the single use case I have for AxisArrays
right now, it would be very convenient if the default indexing of an AxisArray
used the values of the indices instead of the indices of the indices. E.g. if I do
x = AxisArray(Axis{:x}(-3:3))
x[[-2,1]]
it would be handy if it returned the second and fifth element of the AxisVector
. You can always refer to the indices of the indices by using the Axis
type. Could this work?
On both 0.5 and 0.6:
julia> using AxisArrays
julia> AxisArrays.axistrait(Axis{:myaxis}([1,2,3]))
AxisArrays.Unsupported
julia> AxisArrays.axistrait(Axis{:myaxis}([:A, :B]))
AxisArrays.Unsupported
Right now the methods are:
axistrait(::Any) = Unsupported
axistrait{T<:Union{Number, Dates.AbstractTime}}(::AbstractVector{T}) = Dimensional
axistrait{T<:Union{Symbol, AbstractString}}(::AbstractVector{T}) = Categorical
A working dispatch-based solution should be possible in 0.6, not sure about 0.5? Alternatively this could be switched over to a single method using eltype
?
I'm thinking about mapslices
-like behavior, where I have an algorithm where I might want to pass in a particular axis: foo(A, ax)
where ax
is one of the axes
of A
. I might write foo
something like this:
function foo(A, ax)
for i = 1:length(ax)
Asl = A[ax(i)]
# do something with Asl
end
return blah
end
The key line is that A[ax(i)]
, which doesn't work. I find myself doing this kind of experimentation:
julia> using AxisArrays
julia> A = AxisArray(rand(3,5), :x, :y)
2-dimensional AxisArray{Float64,2,...} with axes:
:x, 1:3
:y, 1:5
And data, a 3×5 Array{Float64,2}:
0.351439 0.92995 0.0618315 0.9546 0.298008
0.721548 0.149403 0.455972 0.999368 0.592674
0.175467 0.123669 0.137399 0.66286 0.828891
julia> ax = axes(A)[2]
AxisArrays.Axis{:y,UnitRange{Int64}}(1:5)
julia> A[ax[2]]
0.7215482478540596
julia> ax(2)
ERROR: MethodError: objects of type AxisArrays.Axis{:y,UnitRange{Int64}} are not callable
julia> A[Axis{:y}(2)]
1-dimensional AxisArray{Float64,1,...} with axes:
:x, 1:3
And data, a 3-element SubArray{Float64,1,Array{Float64,2},Tuple{Colon,Int64},true}:
0.92995
0.149403
0.123669
julia> axisnames(ax)
(:y,)
The last one works, but you seemingly have to know the name
of ax
to make it work. If you've passed ax
as an argument, there isn't a way to know this short of dispatching on name
in Axis{name}
. EDIT: of course, one is likely to write foo
with signature foo(A, ax::Axis)
, so maybe it's not so bad to use dispatch. It still feels like ax
objects should be indexable-for-indexing, though.
Am I missing a simple way to do this? If not, I'd volunteer to implement ax(2)
if you think that's reasonable.
Currently:
julia> input = AxisArray(collect(reshape(1:60, 12, 5)), .1:.1:1.2, [:a, :b, :c, :d, :e])
2-dimensional AxisArray{Int64,2,...} with axes:
:row, 0.1:0.1:1.2
:col, Symbol[:a, :b, :c, :d, :e]
And data, a 12×5 Array{Int64,2}:
1 13 25 37 49
2 14 26 38 50
3 15 27 39 51
4 16 28 40 52
5 17 29 41 53
6 18 30 42 54
7 19 31 43 55
8 20 32 44 56
9 21 33 45 57
10 22 34 46 58
11 23 35 47 59
12 24 36 48 60
julia> input[find(isfinite.(input))]
1-dimensional AxisArray{Int64,1,...} with axes:
:col, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10 … 51, 52, 53, 54, 55, 56, 57, 58, 59, 60]
And data, a 60-element Array{Int64,1}:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
⋮
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
julia> input[isfinite.(input)]
ERROR: BoundsError: attempt to access 12×5 Array{Int64,2} at index [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10 … 51, 52, 53, 54, 55, 56, 57, 58, 59, 60], Base.Slice(Base.OneTo(5))]
Stacktrace:
[1] throw_boundserror(::Array{Int64,2}, ::Tuple{Array{Int64,1},Base.Slice{Base.OneTo{Int64}}}) at ./abstractarray.jl:433
[2] checkbounds at ./abstractarray.jl:362 [inlined]
[3] macro expansion at ./multidimensional.jl:441 [inlined]
[4] _getindex at ./multidimensional.jl:438 [inlined]
[5] getindex at ./abstractarray.jl:882 [inlined]
[6] getindex at /Users/ericdavies/.julia/v0.6/AxisArrays/src/indexing.jl:88 [inlined]
[7] getindex(::AxisArrays.AxisArray{Int64,2,Array{Int64,2},Tuple{AxisArrays.Axis{:row,StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}},AxisArrays.Axis{:col,Array{Symbol,1}}}}, ::BitArray{2}) at /Users/ericdavies/.julia/v0.6/AxisArrays/src/indexing.jl:99
IMO these should have identical results. I'll work on this unless there are any concerns.
Hello, first of all: nice package! It's doing what I was trying to build in my own package. Now, I'm more into integrating this package into my own. :)
I do have one issue so far though: I'm building some AxisArrays with a time, longitude and latitude axis. Using the following syntax, I can't access the last element of the time dimension using end
.
edit - using:
Julia 0.5
AxisArrays 0.0.4
julia> typeof(test)
AxisArrays.AxisArray{Float64,3,Array{Float64,3},Tuple{AxisArrays.Axis{:time,Array{Date,1}},AxisArrays.Axis{:lon,Array{Float64,1}},AxisArrays.Axis{:lat,Array{Float64,1}}}}
julia> T = test[AxisArrays.Axis{:time}]
AxisArrays.Axis{:time,Array{Date,1}}(Date[1950-01-01,1950-01-02,1950-01-03,1950-01-04,1950-01-05,1950-01-06,1950-01-07,1950-01-08,1950-01-09,1950-01-10 … 2100-12-22,2100-12-23,2100-12-24,2100-12-25,2100-12-26,2100-12-27,2100-12-28,2100-12-29,2100-12-30,2100-12-31])
julia> typeof(T)
AxisArrays.Axis{:time,Array{Date,1}}
julia> T[1]
1950-01-01
julia> T[end]
MethodError: no method matching endof(::AxisArrays.Axis{:time,Array{Date,1}})
Closest candidates are:
endof(::SimpleVector) at essentials.jl:169
endof(::Char) at char.jl:18
endof(::String) at strings/string.jl:37
...
julia> T[55115]
2100-12-31
Another example:
julia> test[Axis{:time}(1)]
2-dimensional AxisArray{Float64,2,...} with axes:
:lon, [0.0,2.8125,5.625,8.4375,11.25,14.0625,16.875,19.6875,22.5,25.3125 … 331.875,334.688,337.5,340.313,343.125,345.938,348.75,351.563,354.375,357.188]
:lat, [-87.8638,-85.0965,-82.3129,-79.5256,-76.7369,-73.9475,-71.1578,-68.3678,-65.5776,-62.7874 … 62.7874,65.5776,68.3678,71.1578,73.9475,76.7369,79.5256,82.3129,85.0965,87.8638]
And data, a 128×64 Array{Float64,2}:
-13.1781 -9.32368 -7.82642 -8.93107 -9.87894 … -1.83304 -4.26288 -10.9854 -22.7088 -32.7322 -37.4845
-13.1757 -9.49341 -8.18998 -9.56293 -10.5652 -2.10532 -3.36542 -8.54114 -20.7078 -31.7471 -37.1273
-13.2207 -9.75245 -8.60462 -10.1924 -11.126 -2.91254 -3.39817 -7.39728 -18.8907 -30.91 -36.9068
-13.4053 -10.0192 -9.02738 -10.6354 -11.8759 -4.08307 -4.07075 -6.77033 -17.479 -30.0259 -36.6732
-13.533 -10.9317 -9.48826 -11.1507 -12.4858 -5.39445 -5.13422 -7.10557 -16.6137 -29.0839 -36.4522
-13.6584 -10.7656 -9.93928 -11.7239 -12.8303 … -6.65769 -7.51227 -8.27405 -16.1455 -28.2062 -36.322
-13.6874 -10.9521 -10.4106 -12.1195 -13.2295 -8.16553 -10.9555 -8.8057 -15.9809 -27.546 -36.2891
-13.7967 -11.1882 -10.8646 -12.5337 -13.5852 -9.87189 -12.7761 -9.81238 -16.2154 -27.0488 -36.325
-13.8111 -11.4431 -11.3259 -13.0253 -13.9069 -11.5795 -15.156 -11.1282 -16.5414 -26.7977 -36.3877
-13.9304 -11.7004 -11.7421 -13.4728 -14.2188 -13.0959 -16.7264 -15.3921 -16.9609 -26.8751 -36.26
-14.0054 -11.9974 -12.1023 -13.8859 -14.5353 … -15.4631 -20.6427 -17.7955 -17.3621 -27.1115 -36.0593
-14.0489 -12.3577 -12.4554 -14.2304 -14.8702 -16.9788 -23.8476 -19.3232 -18.0571 -27.1679 -35.8542
-14.1056 -12.4714 -12.8453 -14.4906 -15.1919 -20.4497 -26.3586 -20.5891 -18.5888 -27.1167 -35.6224
-14.2416 -12.7158 -13.2491 -14.738 -15.3265 -22.8977 -28.4267 -21.5647 -19.1263 -27.1368 -35.4454
-14.1789 -12.9751 -13.5664 -14.798 -15.3577 -24.4087 -29.7131 -22.5045 -19.7391 -27.3171 -35.3752
-14.3764 -13.238 -13.8794 -14.9066 -15.2137 … -25.6306 -30.6848 -23.4341 -20.2359 -27.5961 -35.4666
⋮ ⋱ ⋮
-12.6153 -7.49509 -4.35917 -3.55839 -2.69089 -30.2057 -30.5568 -32.3227 -37.9244 -38.7628 -38.3909
-12.5765 -7.47901 -4.42795 -3.67417 -3.07596 -31.7261 -34.4672 -33.0697 -38.7802 -39.4308 -38.6722
-12.5693 -7.47721 -4.52579 -3.77119 -2.93836 -33.7065 -35.7553 -33.3518 -39.5584 -40.0128 -38.9905
-12.6683 -7.48649 -4.58289 -3.85673 -2.86304 … -35.142 -35.927 -33.8506 -40.2003 -40.5507 -39.2897
-12.7379 -7.51856 -4.69724 -3.96305 -2.78657 -32.0892 -37.2641 -34.1958 -40.5407 -41.0779 -39.5395
-12.6542 -7.57889 -4.86066 -4.07657 -2.7201 -32.0626 -36.9486 -34.3959 -40.6646 -41.5107 -39.7349
-12.643 -7.64567 -4.9675 -4.24183 -2.78983 -27.3247 -35.6354 -34.0476 -40.3867 -41.3766 -39.8855
-12.7144 -7.73249 -5.1489 -4.43427 -2.97227 -25.3403 -33.3752 -33.4608 -39.6832 -40.7994 -40.0125
-12.7472 -7.83204 -5.366 -4.72288 -3.25089 … -21.6124 -29.8843 -33.0288 -39.0629 -40.2828 -39.9988
-12.7618 -7.94227 -5.5405 -5.03351 -3.53205 -16.5938 -23.5874 -31.3431 -37.6958 -39.6077 -39.7354
-12.772 -8.08826 -5.77714 -5.37885 -4.11234 -13.8241 -20.7749 -26.1037 -36.3062 -38.7978 -39.4562
-12.8576 -8.25623 -6.13535 -5.85761 -4.90565 -10.706 -18.2744 -24.2676 -33.6903 -37.878 -39.1897
-12.9316 -8.43967 -6.46885 -6.31055 -5.83805 -8.03733 -15.184 -22.3596 -30.9707 -36.882 -39.0329
-12.9815 -8.64146 -6.78547 -6.95096 -6.82795 … -4.94059 -11.9601 -19.3953 -29.2095 -35.781 -38.3263
-13.0144 -8.85602 -7.04481 -7.52024 -7.87766 -3.31864 -9.8346 -16.7155 -27.2396 -34.8338 -38.1462
-13.0999 -9.158 -7.44782 -8.31867 -8.89557 -2.30311 -6.4213 -13.8705 -24.8114 -33.7981 -37.8232
julia> test[Axis{:time}(end)]
------ BoundsError --------------------- Stacktrace (most recent call last)
[1] — getindex(::AxisArrays.AxisArray{Float64,3,Array{Float64,3},Tuple{AxisArrays.Axis{:time,Array{Date,1}},AxisArrays.Axis{:lon,Array{Float64,1}},AxisArrays.Axis{:lat,Array{Float64,1}}}}, ::AxisArrays.Axis{:time,Int64}) at indexing.jl:84
[2] — getindex at indexing.jl:60 [inlined]
[3] — getindex at abstractarray.jl:752 [inlined]
[4] — _getindex at multidimensional.jl:270 [inlined]
[5] — checkbounds at abstractarray.jl:284 [inlined]
[6] — throw_boundserror(::Array{Float64,3}, ::Tuple{Int64,Colon,Colon}) at abstractarray.jl:355
BoundsError: attempt to access 55115×128×64 Array{Float64,3} at index [451502080,Colon(),Colon()]
Indexing: Error During Test
Test threw an exception of type ErrorException
Expression: @inferred(D[1, 1, 1, :]) == @inferred(D[1, 1, 1, 1:1]) == @inferred(D[1, 1, 1, [1]]) == AxisArray([10], Axis{:dim_4}(Base.OneTo(1)))
return type AxisArrays.AxisArray{Int64,1,Array{Int64,1},Tuple{AxisArrays.Axis{:dim_4,Base.OneTo{Int64}}}} does not match inferred return type AxisArrays.AxisArray{Int64,1,Array{Int64,1},_} where _
There are others but this one is the simplest (found by running AxisArrays master tests on JuliaLang master).
We can't restrict the type accepted to Tuple{Vararg{Type{<:Axis}}}
because typeof((Axis{:time}, Axis{:label})) == Tuple{UnionAll, UnionAll}
but we might be able to assume they are axis types when they're not Base.Dims
.
This would be especially useful after JuliaLang/julia#23500
It looks like AA redefines cat
on basic Vector types:
julia> using AxisArrays
julia> ll = Array{Array{Symbol,1},1}()
0-element Array{Array{Symbol,1},1}
julia> cat(ll..., dims=1)
ERROR: BoundsError: attempt to access ()
at index [1]
Stacktrace:
[1] getindex(::Tuple{}, ::Int64) at ./tuple.jl:24
[2] _cat at /home/pablo/.julia/packages/AxisArrays/G6pZY/src/combine.jl:20 [inlined]
[3] #cat#44(::Int64, ::Function) at /home/pablo/.julia/packages/AxisArrays/G6pZY/src/combine.jl:16
[4] (::getfield(Base, Symbol("#kw##cat")))(::NamedTuple{(:dims,),Tuple{Int64}}, ::typeof(cat)) at ./none:0
[5] top-level scope at none:0
Having looked at the source code it is clear that when taking a view of an AxisArray, the indexed axes might be copied if _new_axes
is called (https://github.com/JuliaArrays/AxisArrays.jl/blob/master/src/indexing.jl#L80). Is this the intended behaviour? It would make more sense to me for views to be taken of the original axes as well, to aid performance.
Example below (there is 16KB of allocation - i.e. the index copy)
using Dates
using AxisArrays
using BenchmarkTools
index = collect(DateTime(2018, 1, 1):Minute(1):DateTime(2018, 1, 10))
B = AxisArray(randn(Float64, (length(index), 10)), index)
@btime view(B, 1000:3000, :)
# 994.800 ns (6 allocations: 15.98 KiB)
If this should be fixed / changed, I'm happy to give a PR a go.
Linear indexing always produces one axis named :col
, whereas the first default index name is actually :row
.
julia> using Unitful, AxisArrays
julia> C = AxisArray(collect(reshape(1:15,3,5)), Axis{:y}([2.0,3.0,4.5]u"m"), Axis{:x}([2.0,2.1,2.3,2.5,2.6]u"cm"))
2-dimensional AxisArray{Int64,2,...} with axes:
:y, Quantity{Float64, Dimensions:{𝐋}, Units:{m}}[2.0 m, 3.0 m, 4.5 m]
:x, Quantity{Float64, Dimensions:{𝐋}, Units:{cm}}[2.0 cm, 2.1 cm, 2.3 cm, 2.5 cm, 2.6 cm]
And data, a 3×5 Array{Int64,2}:
1 4 7 10 13
2 5 8 11 14
3 6 9 12 15
julia> sum(C, 1)
ERROR: DimensionError: m and 1 are not dimensionally compatible.
Stacktrace:
[1] copy!(::IndexLinear, ::Array{Quantity{Float64, Dimensions:{𝐋}, Units:{m}},1}, ::IndexLinear, ::Base.OneTo{Int64}) at ./abstractarray.jl:655
[2] reduced_axis at /Users/ajkeller/.julia/v0.6/AxisArrays/src/core.jl:345 [inlined]
[3] (::AxisArrays.##5#6{Tuple{Int64}})(::AxisArrays.Axis{:y,Array{Quantity{Float64, Dimensions:{𝐋}, Units:{m}},1}}, ::Int64) at /Users/ajkeller/.julia/v0.6/AxisArrays/src/core.jl:306
[4] mapreducedim at ./reducedim.jl:241 [inlined]
[5] sum at ./reducedim.jl:570 [inlined]
[6] sum(::AxisArrays.AxisArray{Int64,2,Array{Int64,2},Tuple{AxisArrays.Axis{:y,Array{Quantity{Float64, Dimensions:{𝐋}, Units:{m}},1}},AxisArrays.Axis{:x,Array{Quantity{Float64, Dimensions:{𝐋}, Units:{cm}},1}}}}, ::Int64) at ./reducedim.jl:572
Would be fixed by redefining reduced_axis
and reduced_axis0
in src/core.jl
:
reduced_axis( ax) = ax(Base.OneTo(1))
reduced_axis0(ax) = ax(length(ax.val) == 0 ? Base.OneTo(0) : Base.OneTo(1))
However, this change would be at the expense of inferrability in certain cases, resulting in some regressions.
I noticed this issue when looking over #112. That PR improves the situation for other types but does not fix this issue, because Unitful.Quantity <: Number
. One possible fix based on that PR would be to have an inferrable reduced_axis
for Union{Real,Complex}
rather than Number
but I'm not sure what the broader consequences would be. It seems like a lot of Number
subtypes defined in packages are actually <: Real
(e.g. Measurements.Measurement
, ForwardDiff.Dual
) so maybe this would be acceptable.
Hello!
I'm wondering if there is a way to easily keep one of the axis as a "sort" axis constraint in merge operations. Here's an example of what is the current behavior:
A = AxisArray(collect(1:11), Axis{:time}(0:0.1:1))
B = AxisArray(collect(12:22), Axis{:time}(1.1:0.1:2.1))
M = merge(B, A) # ideally, I would like to force the time axis to an ascending order
1-dimensional AxisArray{Int64,1,...} with axes:
:time, [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0 … 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
And data, a 22-element Array{Int64,1}:
12
13
14
15
16
17
18
19
20
21
⋮
3
4
5
6
7
8
9
10
11
Now, trying sort(M)
returns
sort(M)
1-dimensional AxisArray{Int64,1,...} with axes:
:time, [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0 … 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
And data, a 22-element Array{Int64,1}:
1
2
3
4
5
6
7
8
9
10
⋮
14
15
16
17
18
19
20
21
22
Hence, the values are sorted, but not the time axis definition.
Ideally, I would like to enforce the :time
axis to be the sorting constraint in merge
function.
I coded a workaround in a custom merge function which consist of reverting the order of the argument (i.e. merge(B, A) -> merge(A, B)
). But I suspect that it only works for 2-input call to merge without a more complex workaround.
Thanks for any help or hint on how it can be done!
I'm not sure what to do here, but this is a problem:
julia> A = AxisArray(1:10)
10-element AxisArrays.AxisArray{Int64,1,UnitRange{Int64},(AxisArrays.Axis{:row,UnitRange{Int64}},)}:
1
2
3
4
5
6
7
8
9
10
julia> A[[3,2,1]]
ERROR: Dimensional axes must be monotonically increasing
in checkaxis at /Users/mbauman/.julia/v0.4/AxisArrays/src/core.jl:290
in anonymous at /Users/mbauman/.julia/v0.4/AxisArrays/src/core.jl:155
in anonymous at /Users/mbauman/.julia/v0.4/AxisArrays/src/indexing.jl:51
I've been thinking about building a range-based n-dimensional time-series data structure (similar in functionality to TimeSeries.jl, with the first axis dimensional and any others categorical) on top of AxisArrays.jl - TimeType
s like Date
, DateTime
, and ZonedDateTime
don't have a defined zero value, however (see JuliaLang/julia#10450), which prevents them from being used as dimensional variables (because of the current implementation of the monitonically-increasing / positive step check).
Would any of these solutions make sense?
TimeType
sRange{T}
with T <: TimeType
TimeType
sWhile I'm still getting my bearings with the package, I'd be happy to either contribute here or add extensions via the downstream package as appropriate.
Is there any interest in having a convenience function for setting units for an existing Axis
or AxisArray
that is currently unitless? I found this useful when loading someone else's NRRD
image using NRRD.jl. With this format it seems common to save images with meaningful spacings but without units, and units are basically communicated by word of mouth since the format doesn't standardize them anyway. I've been manually re-creating the axes after I load the images and am looking for something more convenient. Am I missing something in the API? What if we overload arithmetic like this?
using AxisArrays, Unitful
ax = Axis{:x}(linspace(0,10,11))
ax *= Unitful.μm
The array underlying the AxisArray of an AxisArray is an AxisArray:
julia> a = AxisArray(zeros(2, 4), Axis{:a}((3, 4)))
2-dimensional AxisArray{Float64,2,...} with axes:
:a, (3, 4)
:col, Base.OneTo(4)
And data, a 2x4 Array{Float64,2}:
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
julia> b = AxisArray(a)
2-dimensional AxisArray{Float64,2,...} with axes:
:row, Base.OneTo(2)
:col, Base.OneTo(4)
And data, a 2-dimensional AxisArray{Float64,2,...} with axes:
:a, (3, 4)
:col, Base.OneTo(4)
And data, a 2x4 Array{Float64,2}:
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
Array, however, is a no-op:
julia> a = Array{Int64}((1, 2))
1x2 Array{Int64,2}:
4520822288 4520821504
julia> Array{Int64}(a) === a
true
I'm not sure what the default behaviour should be:
default_axes
. But that behaviour might all too special.I'm happy to do a PR for 2 or 3
a=AxisArray(rand(3,3,3,10),:x,:y,:z,:time);
b=AxisArray(rand(3,3,3,20),:x,:y,:z,:time);
c=cat(4,a,b)
does not work, am I missing something? Is there something like c=cat(Axis{:time},a,b)
?
c=cat(4,a.data,b.data)
works
With named columns, indexing based on a symbol slows down tight loops. DataFrames has the same problem. It's tough to get fast, type-stable code for this. Here is an example:
using AxisArrays
N = 10_000_000
A = AxisArray(rand(10000000,2), (1:N,[:a,:b]));
function f(x, j)
res = 0.0
for i in 1:size(x,1)
res += x[i,j]
end
res
end
@time f(A, 2) # elapsed time: 0.028395385 seconds (117 kB allocated)
@time f(A, :b) # elapsed time: 0.147722792 seconds (364 kB allocated)
I tried using types as column names to speed things up, but the resulting code isn't type stable, and it's even slower. Here's my attempt:
## Bunch of kludgy stuff
AxisArrays.checkaxis{T}(::Type{Val{T}}) = true
stagedfunction AxisArrays.axisindexes{T,S}(::Type{Val{T}}, ::Type{Val{S}})
i = findfirst(T, S)
:($i)
end
Base.length{T}(::Type{Val{T}}) = length(T)
Base.size{T}(::Type{Val{T}}) = size(T)
Base.size{T}(::Type{Val{T}},i) = size(T,i)
B = AxisArray{Float64,2,Array{Float64,2},(:row,:col),(UnitRange{Int64},DataType)}(rand(N,2), (1:N,Val{(:a,:b)}));
@time f(B, Val{:b}) # elapsed time: 0.982083383 seconds (461 MB allocated, 3.26% gc time in 20 pauses with 0 full sweep)
@time f(B, 2) # elapsed time: 0.019993164 seconds (88 kB allocated)
Hi,
I am wondering if there is a way to change axes.
For example,
using Unitful
using AxisArrays
const μm = u"μm"
const s = u"s"
tmp = AxisArray(rand(1000,1000,2,2), (:x, :y, :z, :t), (0.577μm, 0.577μm, 5μm, 2s))
I would like to change 0.577μm of :x
to some other values. Is there easy with to change it (without generating a new array)?
I tried several things. All didn't work.
# All below do not work.
tmp[Axis{:x}] = 1:1000
tmp = AxisArray(tmp, (:x), (1))
tmp = AxisArray(tmp, (:x, :y, :z, :t), (1, 0.577μm, 5μm, 2s)) # This generates additional axis.
Best,
I'm really liking how the API is coming together. Normal indexing works like a charm, and it makes a lot of sense that the special A[Axis{:time}(idxs)]
indexing form would preserve all the other (unnamed) dimensions. But what about when you index by axis values with missing trailing dimensions? A good example here would be the following:
A = AxisArray(reshape(1:15, 5,3), (.1:.1:0.5, [:a, :b, :c]))
A[Interval(.22, .44)]
As it is currently designed, this will lower to a call to A.data[3:4]
, which is linearly indexing across both dimensions. As such, you'll only get back two elements, when you probably wanted those two rows from all three columns (A.data[3:4, :]
).
I think there's a case to be made here to preserve the omitted dimensions. Since we're indexing by axis value, we cannot select any indices that are out-of-bounds for that dimension — we'll only ever get values from the first column. That doesn't seem very useful to me. Any objections?
julia> using MarketData
julia> Apple = AxisArray(ohlc[1:12].values, (ohlc[1:12].timestamp, colnames(ohlc)))
12x4 AxisArrays.AxisArray{Float64,2,Array{Float64,2},(:row,:col),(Array{Base.Dates.Date,1},Array{UTF8String,1})}:
104.88 112.5 101.69 111.94
108.25 110.62 101.19 102.5
103.75 110.56 103.0 104.0
106.12 107.0 95.0 95.0
96.5 101.0 95.5 99.5
102.0 102.25 94.75 97.75
95.94 99.38 90.5 92.75
95.0 95.5 86.5 87.19
94.48 98.75 92.5 96.75
100.0 102.25 99.38 100.44
101.0 106.0 100.44 103.94
105.62 108.75 103.38 106.56
julia> Apple.axes[1]
12-element Array{Base.Dates.Date,1}:
2000-01-03
2000-01-04
2000-01-05
2000-01-06
2000-01-07
2000-01-10
2000-01-11
2000-01-12
2000-01-13
2000-01-14
2000-01-18
2000-01-19
julia> Apple.axes[2]
4-element Array{UTF8String,1}:
"Open"
"High"
"Low"
"Close"
I'd love to make use of this package in the upcoming Images rewrite, and for that to work nicely we'll need to register & tag it. Is there a list of things that need to be addressed before that happens? It will be some time (week or two?) before the need is urgent, so there's time to address issues. I'm happy to do the work myself.
Also, if @mbauman would like to ditch implicit responsibility for the package, perhaps we should consider moving this to JuliaArrays. I just moved a bunch of my packages there.
As discussed in #25 and illustrated in #64, it's not clear whether concatenating arbitrary AxisArrays together makes sense. As far as I can tell merge
and join
provide the functionality that you would want from cat
, provided your axes have meaningful names and values (and if they don't, why are you using an AxisArray?).
I'd be interested to know if:
cat
merge
/join
can't do.For some historical context, I implemented cat
first and realised soon after that I needed something a bit smarter - that's when I added merge
/join
. Had I had a better understanding of my needs I likely wouldn't have added it to begin with. Now it just seems like a source of potential confusion and frustration: the same axis constraints that govern the other functions apply to cat
as well, it just isn't smart enough to do the right thing when those constraints would be violated, and throws an error instead.
If cat
is indeed redundant, it might make sense to either:
cat
methodmerge
or join
, depending on the arguments (e.g. does the user supply a new axis for join
ing on)?One issue with the second option is that cat
implies a lower-level control over dimensional ordering that merge
and join
abstract away, so users may expect it to do things it won't. Of course, this is true of the current cat
method as well.
julia> AxisArray(randn(3,3), (:x,:y),(0.1,0.1),(-0.1,-0.2))
2-dimensional AxisArray{Float64,2,...} with axes:
:x, -0.1:0.1:0.1
:y, -0.2:0.1:0.0
And data, a 3×3 Array{Float64,2}:
0.763468 0.587158 0.520983
0.0610593 0.801456 0.721546
0.755276 0.834483 0.0795272
julia> maximum(B,2)
3×1 Array{Float64,2}:
0.763468
0.801456
0.834483
julia> B[:,2]
1-dimensional AxisArray{Float64,1,...} with axes:
:x, -0.1:0.1:0.1
And data, a 3-element Array{Float64,1}:
0.587158
0.801456
0.834483
I would have expected that maximum
also returns an AxisArray. This was the case with the old Image
object.
ping @timholy
Hello, I'm trying the following code (which simply calculate the number of element that are over a threshold 1
). Probably a problem on my side (e.g. not using the correct command for what I'm trying to do).
Code
julia> Base.mapreducedim!(t -> t >= 1, +, view(FD, 1:1, :, :), view(A.data, idx, :, :))
Some Outputs
julia> typeof(A.data)
AxisArrays.AxisArray{Float64,3,Array{Float64,3},Tuple{AxisArrays.Axis{:time,Array{Date,1}},AxisArrays.Axis{:lon,Array{Float64,1}},AxisArrays.Axis{:lat,Array{Float64,1}}}}
julia> typeof(FD)
Array{Int64,3}
julia> Base.mapreducedim!(t -> t >= 1, +, view(FD, 1:1, :, :), view(A.data, 1:365, :, :))
------ MethodError --------------------- Stacktrace (most recent call last)
[1] — mapreducedim!(::Function, ::Function, ::SubArray{Int64,3,Array{Int64,3},Tuple{UnitRange{Int64},Colon,Colon},false}, ::AxisArrays.AxisArray{Float64,3,SubArray{Float64,3,Array{Float64,3},Tuple{UnitRange{Int64},Colon,Colon},false},Tuple{AxisArrays.Axis{:time,Array{Date,1}},AxisArrays.Axis{:lon,Array{Float64,1}},AxisArrays.Axis{:lat,Array{Float64,1}}}}) at reducedim.jl:222
[2] — _mapreducedim!(::##39#40, ::Base.#+, ::SubArray{Int64,3,Array{Int64,3},Tuple{UnitRange{Int64},Colon,Colon},false}, ::AxisArrays.AxisArray{Float64,3,SubArray{Float64,3,Array{Float64,3},Tuple{UnitRange{Int64},Colon,Colon},false},Tuple{AxisArrays.Axis{:time,Array{Date,1}},AxisArrays.Axis{:lon,Array{Float64,1}},AxisArrays.Axis{:lat,Array{Float64,1}}}}) at reducedim.jl:206
[3] — macro expansion at simdloop.jl:73 [inlined]
[4] — macro expansion at reducedim.jl:207 [inlined]
MethodError: no method matching isless(::AxisArrays.AxisArray{Float64,1,Array{Float64,1},Tuple{AxisArrays.Axis{:dim_4,Base.OneTo{Int64}}}}, ::Int64)
Closest candidates are:
isless(::Char, ::Integer) at deprecated.jl:49
isless(::AbstractFloat, ::Real) at operators.jl:42
isless(::Real, ::Real) at operators.jl:75
...
Using a regular Array{Float64,3}
, the command works:
julia> R = randn(20440,128,64)
julia> Base.mapreducedim!(t -> t >= 1, +, view(FD, 1:1, :, :), view(R, 1:365, :, :))
1×128×64 SubArray{Int64,3,Array{Int64,3},Tuple{UnitRange{Int64},Colon,Colon},false}:
[:, :, 1] =
58 59 61 67 63 46 64 54 52 54 50 51 59 55 73 53 … 55 56 51 56 64 62 68 46 61 66 57 55 61 56 48
...
[:, :, 64] =
66 48 55 49 70 67 56 59 53 70 40 52 60 49 52 59 … 45 64 59 56 59 63 71 72 67 58 48 46 65 67 48
julia> typeof(R)
Array{Float64,3}
Now that the axis information is stored as proper Axis types, one possible way to both allow an unchecked constructor and fix #14 would be to move the sort check into the Axis constructor. We could even go further, and make Axis an abstract super type of all these traits, with the Axis constructor returning the appropriate type. This could include, for example, a SortedDimensionalAxis type.
This would, of course, make the abstract Axis constructor type-unstable. It also moves more of the axis trait complexity into the foreground.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.