ASV Benchmark suite for `PairwiseDistancesReductions`

Context

PairwiseDistancesReductions are Cython-based implementations of computationally expensive patterns used in many scikit-learn algorithms.

To maintain them in the long term, maintainers, as well as authors and reviewers of Pull Requests proposing changes, need to be able to assess performance regressions between revisions easily and confidently.

This independent asv benchmark suite is meant to help in this regard.

For more context, see:

Quick-start

This suite can be installed with:

git clone git@github.com:jjerphan/pairwise-distances-reductions-asv-suite.git
cd pairwise-distances-reductions-asv-suite
pip install git+https://github.com/airspeed-velocity/asv

This suite can be run with:

# This might take a while (i.e., from several hours up to a day)
# if all combinations are benchmarked.
asv run

For more targeted runs, see the documentation of the asv commands.

Workflow plan

Needs

Get timely feedback on performance improvements or regressions when needed for a scikit-learn Pull Request

In particular:

- have a GitHub Actions workflow that can be triggered by a comment
- specify the revisions to compare (forwarded to asv continuous)
- be able to indicate the configurations to run benchmarks for, in particular regarding the values of the following parameters:
  - PairwiseDistancesReductions
  - metric
  - format of (X, Y) (in {sparse, dense}²)
- have the full, verbose, sorted asv textual report
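As an illustration of the needs listed above, here is a minimal sketch of how a trigger comment could be turned into an `asv continuous` invocation. The `/asv` comment syntax, the `BenchRequest` class, and both helper functions are hypothetical and not part of this suite. `ArgKmin` is only an example value, and using the reduction name as the `--bench` regular expression assumes that benchmark names contain it. How `metric` and the (X, Y) format would be forwarded to the benchmarks is deliberately left open. The asv flags used here should be checked against the installed asv version.

```python
import shlex
from dataclasses import dataclass

# Hypothetical trigger syntax, posted as a Pull Request comment, e.g.:
#   /asv base=main head=feature reduction=ArgKmin metric=euclidean format=dense-sparse
TRIGGER = "/asv"
VALID_FORMATS = {"dense", "sparse"}


@dataclass
class BenchRequest:
    base: str  # revision used as the reference
    head: str  # revision to compare against the reference
    reduction: str = ""  # empty string means "no filter"
    metric: str = ""
    x_format: str = "dense"
    y_format: str = "dense"


def parse_comment(comment: str) -> BenchRequest:
    """Parse a hypothetical '/asv key=value ...' comment."""
    tokens = shlex.split(comment)
    if not tokens or tokens[0] != TRIGGER:
        raise ValueError("not a benchmark trigger comment")
    options = dict(token.split("=", 1) for token in tokens[1:])
    # The (X, Y) format is written as '<x>-<y>', e.g. 'dense-sparse'.
    x_format, _, y_format = options.get("format", "dense-dense").partition("-")
    if {x_format, y_format} - VALID_FORMATS:
        raise ValueError(f"unknown format: {options['format']}")
    return BenchRequest(
        base=options["base"],
        head=options["head"],
        reduction=options.get("reduction", ""),
        metric=options.get("metric", ""),
        x_format=x_format,
        y_format=y_format,
    )


def asv_command(request: BenchRequest) -> list[str]:
    """Build the argument list for a verbose, sorted `asv continuous` run."""
    command = ["asv", "continuous", "--verbose", "--sort", "ratio"]
    if request.reduction:
        # Assumption: benchmark names contain the reduction's name.
        command += ["--bench", request.reduction]
    return command + [request.base, request.head]
```

The resulting list could be passed to `subprocess.run` by the workflow step that reacts to the comment.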

Have an overview of performance with respect to the theoretical ideal limit

In particular:

- output graphs of hardware scalability
- report an estimate of the proportion of sequential code using Amdahl's law

Trace results over time
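The estimate mentioned above can be sketched as follows. Amdahl's law models the speed-up on n threads as S(n) = 1 / (s + (1 - s) / n), where s is the proportion of sequential code. Solving for s with a measured speed-up gives the estimate below. The function name and the timings are hypothetical and only serve the illustration.

```python
def sequential_fraction(t_1: float, t_n: float, n_threads: int) -> float:
    """Estimate the proportion s of sequential code from two timings.

    Amdahl's law: S(n) = 1 / (s + (1 - s) / n), with S(n) = t_1 / t_n.
    Solving for s:  s = (1 / S(n) - 1 / n) / (1 - 1 / n).
    """
    if n_threads < 2:
        raise ValueError("at least 2 threads are needed to estimate s")
    inverse_speedup = t_n / t_1
    return (inverse_speedup - 1 / n_threads) / (1 - 1 / n_threads)


# Hypothetical timings (in seconds) of one benchmark, keyed by thread count.
timings = {1: 8.0, 2: 4.4, 4: 2.6, 8: 1.7}

for n_threads, duration in timings.items():
    if n_threads > 1:
        estimate = sequential_fraction(timings[1], duration, n_threads)
        print(f"{n_threads} threads: estimated sequential fraction {estimate:.3f}")
```

If the estimates stay roughly constant across thread counts, the code follows Amdahl's model well. Estimates that grow with the number of threads suggest additional overhead such as synchronization or memory bandwidth limits.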

Important notes

Benchmarks are correctly and entirely reproducible, traceable and reportable only when the following constraints are met:

- the same machine is used over time: in practice, we can't expect CI providers to allocate the same machine over time, nor to dispatch jobs to machines with identical specifications at a given time.
- no process other than the benchmarks runs on the machine: in practice, we can't expect CI providers to guarantee process isolation.
- benchmark definitions aren't changed between revisions: this requires not reformatting the benchmarks' Python code, because asv hashes the content of the file to trace benchmarks over time.
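The last constraint can be illustrated with a small sketch. It does not reproduce asv's actual hashing scheme: SHA-256 is used here only as a stand-in, and the two benchmark snippets are hypothetical. The point is that a purely cosmetic reformatting yields a different digest, so a content-based hash treats the result as a different benchmark.

```python
import hashlib


def digest(source: str) -> str:
    """Content hash of a piece of source code (stand-in for asv's own hash)."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


# Two hypothetical versions of the same benchmark: identical behaviour,
# different formatting.
original = "def time_reduction(self):\n    self.run(n_samples=1000)\n"
reformatted = (
    "def time_reduction(self):\n"
    "    self.run(\n"
    "        n_samples=1000\n"
    "    )\n"
)

print(digest(original) == digest(original))     # same content, same digest
print(digest(original) == digest(reformatted))  # reformatted, different digest
```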
