Comments (9)

oschulz commented on May 16, 2024

That would be awesome! Let me know if you need any improvements in ArraysOfArrays.jl to support this.

from unroot.jl.

Moelf commented on May 16, 2024

Julia's compiler is too good... each of these should be faster than the previous one, but as you can see, it doesn't make much difference:

julia> @btime t.Muon_pt[1:10^6];
  128.243 ms (49947 allocations: 163.46 MiB)

julia> @btime vcat([UnROOT.basketarray(t.Muon_pt, i) for i in 1:450]...);
  111.932 ms (50418 allocations: 150.14 MiB)

julia> @btime mapreduce(i->UnROOT.basketarray(t.Muon_pt, i), append!, 1:450);
  112.515 ms (49971 allocations: 139.77 MiB)

Moelf commented on May 16, 2024

yeah I said:

each should be faster than the previous

VoV (VectorOfVectors) should be the fastest, but in reality it doesn't make much difference (for that part of the operation).

(Of course, we have other motivations for making slicing return a VoV.)

aminnj commented on May 16, 2024

After the getindex optimization, I see almost identical speeds for the naive way vs one based on basketarray:

julia> @btime t.Muon_pt[1:10^6]
  192.154 ms (51297 allocations: 156.56 MiB)
1000000-element Vector{SubArray{Float32, 1, Vector{Float32}, Tuple{UnitRange{Int64}}, true}}:
 [10.763697, 15.736523]
 # ...

julia> ibasket = findfirst(>=(10^6),t.Muon_pt.b.fBasketEntry)-1
450

julia> @btime vcat([UnROOT.basketarray(t.Muon_pt, i) for i in 1:ibasket]...)
  185.263 ms (51771 allocations: 142.18 MiB)
1643390-element ArraysOfArrays.VectorOfVectors{Float32, Vector{Float32}, Vector{Int32}, Vector{Tuple{}}}:
 Float32[10.763697, 15.736523]
 # ...

Note that I'm ignoring the "chopping" logic needed to go from full baskets to the actual UnitRange. From a speed perspective, I don't see a point unless we want to have VoV downstream...
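
For illustration only, the "chopping" step could look roughly like the sketch below. It is not UnROOT's implementation: `first_entry`, `baskets` and `chopped` are hypothetical names, and the boundaries are assumed to be 1-based here for simplicity (the real `fBasketEntry` values are used with a `-1` in the session above).

```julia
# Hypothetical basket boundaries: basket k holds entries
# first_entry[k] through first_entry[k+1] - 1 (1-based in this sketch).
first_entry = [1, 4, 8, 11]                              # 3 baskets: 1:3, 4:7, 8:10
baskets = [collect(1:3), collect(4:7), collect(8:10)]   # stand-in basket contents

function chopped(baskets, first_entry, r::UnitRange{Int})
    # Find the baskets containing the first and last requested entries.
    k1 = searchsortedlast(first_entry, first(r))
    k2 = searchsortedlast(first_entry, last(r))
    # Concatenate the full baskets, then trim to the requested range.
    full = reduce(vcat, baskets[k1:k2])
    offset = first_entry[k1] - 1                         # entries before basket k1
    return full[(first(r) - offset):(last(r) - offset)]
end

result = chopped(baskets, first_entry, 5:9)
```

The trimming is a single slice on the concatenated result, so it should add little on top of reading the baskets themselves.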

Moelf commented on May 16, 2024

and crazy enough, the performance is also similar

julia> const b = mapreduce(i->UnROOT.basketarray(t.Muon_pt, i), append!, 1:450);

julia> length(b)
1643390

julia> const a = t.Muon_pt[1:1643390];

julia> sum(sum, a) == sum(sum,b)
true

julia> @benchmark sum(sum, a)
BenchmarkTools.Trial: 320 samples with 1 evaluation.
 Range (min … max):  15.523 ms … 15.792 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     15.632 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   15.630 ms ± 30.438 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                            ▁▁  ▃▃▄▄▇█▇▆▄▄▄ ▂  ▁               
  ▃▁▁▁▁▁▃▁▁▁▃▁▁▃▃▃▃▄▅▆▆▃█▄█▇███▅█████████████▄██▅▅▃▅▃▃▃▄▃▁▃▃▄ ▄
  15.5 ms         Histogram: frequency by time        15.7 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark sum(sum, b)
BenchmarkTools.Trial: 304 samples with 1 evaluation.
 Range (min … max):  16.274 ms … 16.760 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     16.489 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   16.480 ms ± 51.774 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                           ▃▃▂ ▄█▂▂            
  ▂▁▁▁▂▂▂▂▃▃▂▂▁▂▂▁▃▃▂▁▂▁▁▂▁▁▁▂▁▁▂▂▂▃▄▃▅▅▆▅█████████▆▅▄▅▄▄▃▅▄▃ ▃
  16.3 ms         Histogram: frequency by time        16.6 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

except for length:

julia> @benchmark sum(length, a)
BenchmarkTools.Trial: 1470 samples with 1 evaluation.
 Range (min … max):  3.290 ms … 4.321 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     3.384 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.389 ms ± 54.861 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

             ▁▂▂▄▃▄▆▇██▇▂▂                                    
  ▃▂▃▂▂▄▃▃▆▅▇█████████████▆▅▄▃▄▃▃▃▂▂▃▂▃▃▂▁▂▃▂▂▂▂▂▂▂▂▂▁▂▁▂▁▂▂ ▄
  3.29 ms        Histogram: frequency by time        3.59 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark sum(length, b)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  330.388 μs … 678.389 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     342.481 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   343.999 μs ±  10.849 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                      █▁ ▁▆▃                                     
  ▂▁▁▁▁▂▁▂▂▁▂▂▂▂▂▄▆▄▅███▇██████▇▆▅▅▅▅▄▄▄▄▄▄▃▃▃▃▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  330 μs           Histogram: frequency by time          360 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.
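
One plausible reason for the gap in `sum(length, ...)`, offered as a guess rather than a measured explanation: the VoV type printed earlier carries a `Vector{Int32}` of element offsets next to the flat data, so each length can be read from two neighbouring offsets that sit contiguously in memory. A Base-only sketch of that idea, with `nested` and `offsets` as made-up toy data:

```julia
# Toy jagged data: three "events" with 2, 0 and 3 values.
nested = [Float32[1, 2], Float32[], Float32[3, 4, 5]]

# Offsets marking where each event starts in a flat layout,
# plus one trailing entry marking the end.
offsets = cumsum([1; length.(nested)])

# Each event's length is the difference of neighbouring offsets...
lengths_from_offsets = diff(offsets)

# ...so the total number of values is simply last offset minus first.
total = offsets[end] - offsets[1]
```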

oschulz commented on May 16, 2024

But with VectorOfVectors, you could do flatview and it would be O(1) :-)

Moelf commented on May 16, 2024

sum is O(1)? I guess you mean length. At any rate, if the jaggedness doesn't matter (i.e. the event-level structure doesn't matter), flatview helps a lot, so it's still beneficial.
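
To make the "jaggedness doesn't matter" case concrete, here is a Base-only toy. The `flat` vector is built by hand as a stand-in for what a flat view exposes directly. Nothing here calls ArraysOfArrays.

```julia
# Toy jagged data: values grouped per "event".
nested = [Float32[1, 2], Float32[], Float32[3, 4, 5]]

# Stand-in for a flat view: in a flat-storage container this vector
# already exists, so no concatenation would be needed to obtain it.
flat = reduce(vcat, nested)

# Event-agnostic reductions run over one contiguous vector...
n_above_two = count(>(2), flat)
largest = maximum(flat)

# ...and agree with the nested formulation.
same_sum = sum(flat) == sum(sum, nested)
```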

oschulz commented on May 16, 2024

sum is O(1)

Oh, no! I meant the append!-example above. :-)

oschulz commented on May 16, 2024

A structure like VectorOfVectors is also helpful if you want to append whole "jagged" column/branch content, because it will append the flat vectors underneath very efficiently.
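
A rough sketch of why that append can be cheap, using a hypothetical `FlatJagged` type as a stand-in for the flat-data-plus-offsets layout (this is not the ArraysOfArrays implementation, and `append_jagged!` and `element` are made-up helper names):

```julia
# Hypothetical stand-in: all values in one vector, plus start offsets
# (with one trailing entry marking the end).
struct FlatJagged
    data::Vector{Float32}
    offsets::Vector{Int}
end

# Read back the i-th inner vector as a copy.
element(j::FlatJagged, i::Int) = j.data[j.offsets[i]:(j.offsets[i + 1] - 1)]

function append_jagged!(a::FlatJagged, b::FlatJagged)
    # Shift b's offsets so they continue where a's data ends.
    shift = a.offsets[end] - b.offsets[1]
    append!(a.data, b.data)                        # one bulk copy of the values
    append!(a.offsets, b.offsets[2:end] .+ shift)  # one bulk copy of the offsets
    return a
end

a = FlatJagged(Float32[1, 2], [1, 3])              # one event: [1, 2]
b = FlatJagged(Float32[3, 4, 5], [1, 1, 4])        # two events: [] and [3, 4, 5]
append_jagged!(a, b)
```

The point of the sketch is that the cost is two bulk appends, independent of how many small inner vectors are involved.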
