
ROOT Benchmarks

This repository contains a set of relatively small programs (usually based on the Google Benchmark microbenchmarking infrastructure) built on top of ROOT. Their primary goal is to provide stable performance metrics that can be monitored over time.

Results of the nightly runs of the ROOT benchmarks can be found in the rootbench.git Grafana instance: RootBench Grafana.

Project Health

Badges: Linux/macOS build status · experimental benchmark coverage (Coveralls).

Cite

@inproceedings{shadura2019continuous,
  title={Continuous Performance Benchmarking Framework for ROOT},
  author={Shadura, Oksana and Vassilev, Vassil and Bockelman, Brian Paul},
  booktitle={EPJ Web of Conferences},
  volume={214},
  pages={05003},
  year={2019},
  organization={EDP Sciences}
}

Building

ROOTBench can be built standalone or as part of ROOT. To enable ROOTBench in a ROOT build, just add the -Drootbench=On option to your CMake configuration.
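
For example, when configuring a ROOT build (the source path is illustrative):

cmake -Drootbench=On /path/to/root-src
cmake --build .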

Building ROOTBench standalone

ROOTBench should be able to find ROOT at configuration time. Make sure you have run source $ROOTSYS/bin/thisroot.sh.

git clone https://github.com/root-project/rootbench.git
mkdir build
cd build
cmake ../rootbench
cmake --build . -- -j$(nproc)

Alternatively, you can use

cmake --build . -- -jN

where N is the maximum number of processor cores you want to use.

Extending the benchmarks

ROOTBench relies on Google Benchmark. We recommend reading the available documentation and browsing the existing examples in this repository for more advanced usage.

Background

Continuous integration for this repository runs in two stages:

  • We run TravisCI on each pull request -- the public infrastructure is time-limited, so we use the latest ROOT nightly build available on CVMFS and EOS. This way we can integrate with public services such as Coveralls. Based on the TravisCI information we compute the benchmarking coverage of ROOTBench against ROOT; the idea is to make sure that we have well-distributed benchmarking coverage.
  • We run on dedicated CERN OpenLab machines twice a day -- we build ROOT and ROOTBench from scratch and collect performance data. The data is uploaded to our Grafana service, available here (requires CERN login).

The integration process depends on the overall benchmarking time. Contributors are encouraged to write well-focused microbenchmarks ensuring good benchmarking coverage. Non-overlapping microbenchmarks seem to be the only reasonable way to control the pressure on the infrastructure.

Conventions

There are several practical conventions that we should follow:

  • Coding conventions -- ROOTBench largely follows the coding conventions of ROOT.
  • The routines used for benchmarking shall be named following the pattern BM_CLASSNAME_ROUTINE -- the BM prefix allows us (or tools) to easily identify the main benchmarking functions.

Simple benchmark template

Add a file called CLASSNAMEBenchmarks.cxx, where CLASSNAME is the name of the ROOT class being benchmarked.

#include "ROOT_HEADER_TO_BENCHMARK.h"

#include "benchmark/benchmark.h"

// Replace CLASSNAME and ROUTINE with the ROOT class and routine you are benchmarking, respectively.
static void BM_CLASSNAME_ROUTINE(benchmark::State &state) {
  // Initialization section before actual benchmarking.
  for (auto _ : state) {
    // The benchmarking code goes here.
  }
  // Teardown.
}
BENCHMARK(BM_CLASSNAME_ROUTINE);

// At the end of the file we add our main().
BENCHMARK_MAIN();
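
As a concrete illustration, a hypothetical TMathBenchmarks.cxx measuring TMath::Sqrt could look like this (a minimal sketch; the class and routine were chosen purely as examples):

#include "TMath.h"

#include "benchmark/benchmark.h"

// Hypothetical example: measure the cost of TMath::Sqrt on a fixed input.
static void BM_TMath_Sqrt(benchmark::State &state) {
  double x = 42.0;
  for (auto _ : state) {
    // DoNotOptimize prevents the compiler from eliding the call.
    benchmark::DoNotOptimize(TMath::Sqrt(x));
  }
}
BENCHMARK(BM_TMath_Sqrt);

BENCHMARK_MAIN();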

Register the benchmark in the system by adding an entry to the CMakeLists.txt next to the source code of the benchmark (a filled-in example follows the template):

RB_ADD_GBENCHMARK(CLASSNAMEBenchmarks
  CLASSNAMEBenchmarks.cxx
  LABEL short
  LIBRARIES LIST OF LIB DEPENDENCIES)
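
Continuing the hypothetical TMath example, the registration might look as follows (the library names are illustrative and depend on the class under test):

RB_ADD_GBENCHMARK(TMathBenchmarks
  TMathBenchmarks.cxx
  LABEL short
  LIBRARIES Core MathCore)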

This is a very basic working example. If you need extra functionality, please read the Google Benchmark docs.
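
For instance, Google Benchmark lets you parameterize a benchmark over input sizes. A minimal sketch (the routine name and workload are placeholders, not an existing ROOT benchmark; this fragment assumes a main() elsewhere in the file):

#include <vector>

#include "benchmark/benchmark.h"

// Illustrative sketch: vary the workload size via Range().
static void BM_CLASSNAME_ROUTINE_Sized(benchmark::State &state) {
  const int n = state.range(0); // size injected by Range() below
  std::vector<double> v(n, 1.0);
  for (auto _ : state) {
    double sum = 0.0;
    for (double x : v) sum += x;
    benchmark::DoNotOptimize(sum); // keep the loop from being optimized away
  }
  state.SetItemsProcessed(state.iterations() * n);
}
// Runs for sizes 8, 64, 512, 4096, 8192 (the default range multiplier is 8).
BENCHMARK(BM_CLASSNAME_ROUTINE_Sized)->Range(8, 8 << 10);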


Contributors

amadio, arifahmed1995, ashlaban, axel-naumann, claireguyot, codetronaut, egpbos, eguiraud, fylux, grimmmyshini, guitargeek, hageboeck, hahnjo, hlilit, ikabadzhov, jacklqiu, jalopezg-git, jblomer, kamahori, lmoneta, manolismih, oshadura, peremato, steremma, stwunsch, sudo-panda, teemperor, vgvassilev, vincecr0ft, yamaguchi1024


Issues

Add a way to pass a specific version of python to rootbench

When I have multiple versions of Python on my system, rootbench tends to pick the latest one, which might not be the one I configured ROOT with, or it picks inconsistent versions across the different variables.

For instance, when building within the gentoo prefix, where:

$  root-config --python-version
3.8.3

The variables set by CMake are:

PYTHON_EXECUTABLE:INTERNAL=/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/bin/python3.9
PYTHON_INCLUDE_DIRS:INTERNAL=/usr/include/python3.6m
PYTHON_LIBRARIES:INTERNAL=/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/libpython3.9.so

None of these is the version I need. It would be nice to have a mechanism to specify the Python version to work with.
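
A possible workaround (untested, and dependent on which CMake find module rootbench actually uses; the variable name matches the cache variables shown above) is to pin the interpreter explicitly at configure time:

cmake -DPYTHON_EXECUTABLE=$(command -v python3.8) ../rootbench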

Add an option to choose where to save CSV result files and other artifacts

During benchmark execution, rootbench generates CSV files and other artifacts. At the moment, the path where these files are saved is not configurable. Would it be a good idea to add an option to choose where to save these files? One use case I can see for this feature is running rootbench in a Singularity container: Singularity usually uses a read-only filesystem, so it would be great to be able to bind-mount a writable directory into the container and save all of the generated files there.

Machine running rootbench is out of space

The machine running rootbench runs out of space if we run all currently configured configurations (compiler/build options). As a consequence, no data points appear in the Grafana board!

Expose CXX standard in rootbench

We require an exposed CMAKE_CXX_STANDARD in rootbench. Currently, it is either not there or only set when rootbench is built in with ROOT. The hotfix looks like this but does not correctly cover the case "standard > 14" with an external build of rootbench.
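
A minimal sketch of what exposing the standard for standalone builds could look like (the default of 14 is an assumption, not the project's actual choice):

if(NOT DEFINED CMAKE_CXX_STANDARD)
  set(CMAKE_CXX_STANDARD 14 CACHE STRING "C++ standard used to build rootbench")
endif()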

Add documentation about adding panels and dashboards

In particular, queries to retrieve datapoints from InfluxDB have some subtleties (an example query is sketched after the list):

  • must use InfluxDB, not rootbench-InfluxDB as source (unless you want old data)
  • must select a single nodelabel and build configuration in the WHERE clause
  • python benchmarks store runtimes in the duration field, C++ benchmarks use real_time

Oh and you need to press "Save" or lose all your changes :)
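
A hypothetical query for a C++ benchmark that follows these rules might look like the following (the measurement name and tag keys/values are illustrative, not the actual schema; only the real_time field comes from the notes above; $timeFilter and $__interval are standard Grafana macros):

SELECT mean("real_time") FROM "benchmarks"
WHERE "nodelabel" = 'node-01' AND "config" = 'gcc-release' AND $timeFilter
GROUP BY time($__interval)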

Make sure all integration benchmarks run with equal file cache conditions

Currently we have a few integration benchmarks that only execute a single repetition, but some of these benchmarks share input files. As a consequence, the first of these benchmarks would run on a cold file cache and the others on a warm cache. We have to normalize this, e.g. by having each integration test run two iterations and discarding the runtime of the first.
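
One variation on the two-iterations idea is an explicit warm-up pass before the timed loop. A minimal Google Benchmark sketch, with hypothetical helper names:

static void BM_Integration_Analysis(benchmark::State &state) {
  ReadInputFile();   // hypothetical warm-up pass: populates the OS file cache
  for (auto _ : state) {
    RunAnalysis();   // hypothetical timed work, now running on a warm cache
  }
}
// A single timed iteration, since the cache state is now controlled.
BENCHMARK(BM_Integration_Analysis)->Iterations(1);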

Updated Statistical Tests with RooFit

It seems the RooFit test scripts currently included in rootbench don't quite provide the depth and complexity required for benchmarking the performance of experimental updates (such as the multiprocess PR and the upcoming CUDA implementations). Is it possible to keep a continuous benchmarking suite with specific tests for high-performance development work separate?

Currently the 'binned' example creates between 1 and 3 HistFactory channels with between 1 and 4 bins each. I believe this was tuned to get the running time for the whole test down to under 20 minutes. Is there an option to 'stress test' by returning this to its original 20 x 40 dimensions?

The unbinned example creates a simple B-decay example indicative of LHCb-style fits; however, this seems fairly trivial in comparison. Ideas include more complex functions, convolutions, multidimensional fits, ranges, or even a Dalitz fit (which, I think, includes all previous suggestions).

A mixed case might also be desirable with a combined binned and unbinned fit, to simulate

@lmoneta @oshadura @hageboeck @guitargeek

Implement an option to globally set/fix the number of parallel processing units

I would like to have a way to globally set/fix the number of processing units the parallel benchmarks will run on.

Some expert reviews of this potential feature:

"We already have the RB_TMP_DIR env variable, we could add another to control the argument passed to EnableImplicitMT" - @eguiraud

"we also have the rootbench datadir as a variable! (...) these vars can be easily exposed through cmake in the ADD_BENCHMARK_FOO functions" - @stwunsch

hadd should always be available

hadd is heavily used from CMake and from executables, so we need to add a routine to ensure we actually have it (in PATH, from the build directory in ROOT, etc.).
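
A minimal sketch of such a check at configure time (the variable name is illustrative):

# Fail early if hadd cannot be located, either in PATH or next to ROOT.
find_program(RB_HADD_EXECUTABLE hadd HINTS $ENV{ROOTSYS}/bin)
if(NOT RB_HADD_EXECUTABLE)
  message(FATAL_ERROR "hadd not found; source thisroot.sh or add it to PATH")
endif()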

Linking error when building on macOS 10.14

When building ROOTBench on macOS, there is a linking error due to the missing librt library.

This patch fixes it:

--- a/cmake/modules/AddRootBench.cmake
+++ b/cmake/modules/AddRootBench.cmake
@@ -43,7 +43,8 @@ function(RB_ADD_GBENCHMARK benchmark)
   # FIXME: For better coherence we could restrict the libraries the test suite could link
   # against. For example, tests in Core should link only against libCore. This could be tricky
   # to implement because some ROOT components create more than one library.
-  target_link_libraries(${benchmark} ${ARG_LIBRARIES} gbenchmark RBSupport rt)
+  target_link_libraries(${benchmark} ${ARG_LIBRARIES} gbenchmark RBSupport)
   #ROOT_PATH_TO_STRING(mangled_name ${benchmark} PATH_SEPARATOR_REPLACEMENT "-")
   #ROOT_ADD_TEST(gbench${mangled_name}
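
Instead of dropping rt everywhere, a platform-guarded link could keep the Linux behaviour (a sketch; librt exists only on Linux):

if(NOT APPLE)
  target_link_libraries(${benchmark} rt)
endif()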

BM_RNTuple_H1 currently broken

Here's what I get running ./RNTupleAnalysisBenchmarks on my machine -- note that the RNTuple timings are orders of magnitude below the TTree ones, which suggests the RNTuple benchmarks are not doing any real work:

BM_RNTuple_H1/BM_RNTuple_H1LZ4/iterations:5        51.5 us         51.2 us            5
BM_RNTuple_H1/BM_RNTuple_H1ZLIB/iterations:5       39.5 us         39.4 us            5
BM_RNTuple_H1/BM_RNTuple_H1LZMA/iterations:5       37.1 us         37.0 us            5
BM_RNTuple_H1/BM_RNTuple_H1ZSTD/iterations:5       36.1 us         36.1 us            5
BM_RNTuple_H1/BM_RNTuple_H1None/iterations:5       37.5 us         37.5 us            5
BM_TTree_H1/BM_TTree_H1LZ4/iterations:5           31416 us        31415 us            5
BM_TTree_H1/BM_TTree_H1ZLIB/iterations:5         226491 us       226484 us            5
BM_TTree_H1/BM_TTree_H1LZMA/iterations:5        1035668 us      1035641 us            5
BM_TTree_H1/BM_TTree_H1ZSTD/iterations:5         224561 us       224556 us            5
BM_TTree_H1/BM_TTree_H1None/iterations:5          28839 us        28840 us            5

External build is broken due to wmass benchmark

I get the following error with an external build:

input_line_12:3:10: fatal error: 'inc/classes.h' file not found
#include "inc/classes.h"
         ^~~~~~~~~~~~~~~
Error: /home/stefan/builds/root-dev/bin/rootcling: compilation failure (/home/stefan/builds/rootbench/root/tree/dataframe/wmass/libSignalAnalysis522503ca6b_dictUmbrella.h)
make[2]: *** [root/tree/dataframe/wmass/CMakeFiles/G__SignalAnalysis.dir/build.make:85: root/tree/dataframe/wmass/G__SignalAnalysis.cxx] Error 1
make[1]: *** [CMakeFiles/Makefile2:2011: root/tree/dataframe/wmass/CMakeFiles/G__SignalAnalysis.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

which seems to stem from the following dictionary generation:

ROOT_GENERATE_DICTIONARY(G__SignalAnalysis inc/classes.h LINKDEF LinkDef.h)
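
One possible fix (an untested sketch) would be to pass an absolute path so rootcling can resolve the header from an external build directory:

ROOT_GENERATE_DICTIONARY(G__SignalAnalysis ${CMAKE_CURRENT_SOURCE_DIR}/inc/classes.h LINKDEF LinkDef.h)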

Can we build with C++17?

The zpeak benchmark needs C++17, and ROOT 7 will also switch to C++17 eventually. Can we switch the nightly rootbench builds to C++17?
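
For a standalone build this would amount to something like the following at configure time (noting that the standard must match the one ROOT itself was built with):

cmake -DCMAKE_CXX_STANDARD=17 ../rootbench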
