GFlops.jl

When code performance is an issue, absolute performance measurements help put numbers on what is "slow" or "fast". GFlops.jl leverages the power of Cassette.jl to automatically count the number of floating-point operations in a piece of code. When combined with the accuracy of BenchmarkTools, this allows for easy and absolute performance measurements.

Installation

This package is registered and can therefore simply be installed with

pkg> add GFlops

Example use

This simple example shows how to track the number of operations in a vector summation:

julia> using GFlops

julia> x = rand(1000);

julia> @count_ops sum($x)
Flop Counter: 999 flop
┌─────┬─────────┐
│     │ Float64 │
├─────┼─────────┤
│ add │     999 │
└─────┴─────────┘

julia> @gflops sum($x);
  8.86 GFlops,  12.76% peak  (9.99e+02 flop, 1.13e-07 s, 0 alloc: 0 bytes)
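
The figures on the @gflops line are related by simple arithmetic. As a rough sketch in plain Julia (GFlops.jl is not needed; the variable names are illustrative and the inputs are just the rounded values printed above), the rate is the flop count divided by the elapsed time:

```julia
# Values copied from the @gflops output above (rounded as printed).
flop    = 999        # floating-point operations counted
seconds = 1.13e-7    # elapsed time reported for one evaluation

# GFlops = operations per second, scaled by 1e9.
gflops = flop / seconds / 1e9

println(round(gflops, digits = 2))
```

This gives about 8.84, close to the 8.86 displayed; the small gap is presumably because the printed time is itself rounded.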

GFlops.jl internally tracks several types of floating-point operations, for both 32-bit and 64-bit operands. Pretty-printing a Flop Counter only shows non-zero entries, but any individual counter can be accessed:

julia> function mixed_dot(x, y)
           acc = 0.0
           @inbounds @simd for i in eachindex(x, y)
               acc += x[i] * y[i]
           end
           acc
       end
mixed_dot (generic function with 1 method)

julia> x = rand(Float32, 1000); y = rand(Float32, 1000);

julia> cnt = @count_ops mixed_dot($x, $y)
Flop Counter: 2000 flop
┌─────┬─────────┬─────────┐
│     │ Float32 │ Float64 │
├─────┼─────────┼─────────┤
│ add │       0 │    1000 │
│ mul │    1000 │       0 │
└─────┴─────────┴─────────┘

julia> fieldnames(GFlops.Counter)
(:fma32, :fma64, :muladd32, :muladd64, :add32, :add64, :sub32, ...)

julia> cnt.add64
1000

julia> @gflops mixed_dot($x, $y);
  9.91 GFlops,  13.36% peak  (2.00e+03 flop, 2.02e-07 s, 0 alloc: 0 bytes)
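
The split between the Float32 and Float64 columns follows from Julia's type promotion: the product of two Float32 elements stays in Float32, while adding it to the Float64 accumulator `acc` promotes the sum to Float64. A minimal sketch in plain Julia (the variable names are illustrative):

```julia
a = rand(Float32)
b = rand(Float32)

# The product of two Float32 values stays in Float32.
prod32 = a * b

# Adding it to a Float64 accumulator promotes the result to Float64.
acc = 0.0 + prod32

println(typeof(prod32))  # Float32
println(typeof(acc))     # Float64
```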

Caveats

Fused Multiplication and Addition: FMA & MulAdd

On systems which support them, FMAs and MulAdds compute two operations (an addition and a multiplication) in one instruction. @count_ops records each individual FMA/MulAdd as one entry in its counters, which makes them easier to interpret. However, the flop total, and therefore @gflops, counts two floating-point operations for each FMA/MulAdd, in accordance with the way high-performance benchmarks usually behave:

julia> x = 0.5; coeffs = rand(10);

# 9 MulAdds but 18 flop
julia> cnt = @count_ops evalpoly($x, $coeffs)
Flop Counter: 18 flop
┌────────┬─────────┐
│        │ Float64 │
├────────┼─────────┤
│ muladd │       9 │
└────────┴─────────┘

julia> @gflops evalpoly($x, $coeffs);
  0.87 GFlops,  1.63% peak  (1.80e+01 flop, 2.06e-08 s, 0 alloc: 0 bytes)
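
The count of 9 MulAdds for 10 coefficients can be checked by hand. The sketch below is a hypothetical re-implementation of Horner's rule in plain Julia, not the code GFlops.jl or `evalpoly` actually runs; it uses one `muladd` per coefficient after the leading one and tallies the calls manually:

```julia
# Hypothetical helper: Horner evaluation that also counts its muladd calls.
function horner_count(x, coeffs)
    acc = coeffs[end]          # start from the highest-degree coefficient
    n_muladd = 0
    for i in length(coeffs)-1:-1:1
        acc = muladd(x, acc, coeffs[i])   # acc = x * acc + coeffs[i]
        n_muladd += 1
    end
    return acc, n_muladd
end

coeffs = rand(10)
val, n = horner_count(0.5, coeffs)

println(n)                            # 9 muladds for 10 coefficients
println(2n)                           # 18 flop when each muladd counts twice
println(val ≈ evalpoly(0.5, coeffs))  # agrees with Base.evalpoly
```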

Non-Julia code

GFlops.jl does not see what happens outside the realm of Julia code. In particular, it does not see operations performed in external libraries, such as BLAS calls:

julia> using LinearAlgebra

julia> x = rand(1000); y = rand(1000);

julia> @count_ops dot($x, $y)
Flop Counter: 0 flop

This is a known issue; we'll try and find a way to circumvent the problem.
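
Until then, one possible workaround is to measure a pure-Julia equivalent of the library routine. The sketch below assumes a hypothetical helper `pure_dot`, which is not part of GFlops.jl, and only checks that it agrees with `dot`:

```julia
using LinearAlgebra

# Hypothetical pure-Julia replacement for `dot`: every add and multiply
# happens in Julia code rather than inside an external BLAS routine.
function pure_dot(x, y)
    acc = zero(promote_type(eltype(x), eltype(y)))
    @inbounds @simd for i in eachindex(x, y)
        acc += x[i] * y[i]
    end
    return acc
end

x = rand(1000); y = rand(1000)
println(pure_dot(x, y) ≈ dot(x, y))
```

Passing `pure_dot($x, $y)` to @count_ops should then report the additions and multiplications, much as for `mixed_dot` above. The timing from @gflops would describe this loop, though, not the BLAS routine.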
