
pairwise-distances-reductions-asv-suite's Introduction

ASV Benchmark suite for PairwiseDistancesReductions

Context

PairwiseDistancesReductions are Cython-based implementations of computationally expensive patterns used in many scikit-learn algorithms.

To maintain them in the long term, maintainers, as well as authors and reviewers of Pull Requests proposing changes, need to be able to assess performance regressions between revisions easily and confidently.

This independent asv benchmark suite is meant to help in this regard.

For more context, see:

Quick-start

This suite can be installed with:

git clone git@github.com:jjerphan/pairwise-distances-reductions-asv-suite.git
cd pairwise-distances-reductions-asv-suite
pip install git+https://github.com/airspeed-velocity/asv

This suite can be run with:

# This might take a while (from several hours up to a day)
# if all combinations are benchmarked.
asv run

For more targeted runs, see the documentation of the asv commands.

Workflow plan

Needs

Get feedback on performance improvements or regressions in a timely manner when needed for a scikit-learn Pull Request

In particular:

  • have a GitHub Actions workflow that can be triggered by a comment
  • specify the revisions to compare (forwarded to asv continuous)
  • be able to specify which configurations to benchmark, in particular the values of the following parameters:
    • PairwiseDistancesReductions
    • metric
    • format of (X,Y) (in {sparse, dense}²)
  • get the full, verbose, sorted asv textual report
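The parameters listed above map naturally onto asv's parametrized benchmarks, where a class declares `params` and `param_names` and asv runs each `time_*` method once per combination. The following is a minimal sketch under stated assumptions, not this suite's actual code: the class name, the reduction and metric labels, the data sizes, and the pure-Python workload are all hypothetical placeholders.

```python
import itertools
import random


class PairwiseDistancesReductionsSuite:
    """asv-style parametrized benchmark; every name and value here is illustrative."""

    # asv calls setup and each time_* method once per combination of these values.
    params = [
        ["ArgKmin", "RadiusNeighbors"],  # reduction labels (assumed)
        ["euclidean", "manhattan"],      # metric labels (assumed)
        ["dense", "sparse"],             # format of X
        ["dense", "sparse"],             # format of Y
    ]
    param_names = ["reduction", "metric", "X_format", "Y_format"]

    def setup(self, reduction, metric, X_format, Y_format):
        rng = random.Random(0)  # fixed seed so every run sees the same data
        self.X = [[rng.random() for _ in range(3)] for _ in range(10)]
        self.Y = [[rng.random() for _ in range(3)] for _ in range(10)]

    def time_reduction(self, reduction, metric, X_format, Y_format):
        # Stand-in workload: index of the closest row of Y for each row of X.
        # Only `metric` is honoured here; a real benchmark would dispatch on
        # all four parameters. Squared distances are enough to rank neighbours.
        power = 2 if metric == "euclidean" else 1
        return [
            min(
                range(len(self.Y)),
                key=lambda j: sum(abs(a - b) ** power for a, b in zip(x, self.Y[j])),
            )
            for x in self.X
        ]


# The full grid that would be benchmarked if no configuration were filtered out.
combinations = list(itertools.product(*PairwiseDistancesReductionsSuite.params))
```

With two values per parameter, the grid already holds 16 combinations, which is why being able to restrict a run to selected configurations matters for timely feedback.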

Have an overview of performance relative to the theoretical ideal limit

In particular:

  • output graphs of hardware scalability
  • report an estimate of the proportion of sequential code using Amdahl's law
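As an illustration of the Amdahl's law estimate, the sketch below inverts the law to recover the sequential proportion from a measured speedup. Amdahl's law gives the speedup on n threads as S(n) = 1 / (s + (1 - s) / n), where s is the sequential proportion, so s = (n / S(n) - 1) / (n - 1). The function names are hypothetical, and the timings are made-up values chosen for the example, not measurements from this suite.

```python
def amdahl_speedup(serial_fraction: float, n_threads: int) -> float:
    """Speedup predicted by Amdahl's law for a given sequential proportion."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_threads)


def estimate_serial_fraction(n_threads: int, speedup: float) -> float:
    """Invert Amdahl's law: s = (n / S - 1) / (n - 1), valid for n > 1."""
    if n_threads < 2:
        raise ValueError("need at least 2 threads to estimate the serial fraction")
    return (n_threads / speedup - 1.0) / (n_threads - 1.0)


# Hypothetical wall-clock timings in seconds, keyed by number of threads.
timings = {1: 8.0, 2: 4.4, 4: 2.6, 8: 1.7}

# One estimate per thread count; the speedup is relative to the 1-thread run.
estimates = {
    n: estimate_serial_fraction(n, timings[1] / t)
    for n, t in timings.items()
    if n > 1
}
```

If the estimates stay roughly constant as the number of threads grows, the code follows Amdahl's model reasonably well; an estimate that rises with the thread count usually points to overhead that the model does not capture, such as synchronization or memory bandwidth limits.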

Trace results over time

Important notes

Benchmarks are correctly and entirely reproducible, traceable and reportable when the following constraining requirements are met:

  • the same machine is used over time: in practice, we can't expect CI providers to allocate the same machines over time, nor to dispatch to machines with identical specifications at a given time.
  • no process other than the benchmarks is running on the machine: in practice, we can't expect CI providers to provide process isolation
  • benchmark definitions aren't changed between revisions: this requires not reformatting the benchmarks' Python code, because asv hashes the content of the file to trace benchmarks over time
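To see why reformatting alone is enough to break traceability, the sketch below hashes two versions of a benchmark that behave identically but are laid out differently. It uses SHA-256 from `hashlib` as a stand-in: asv's own versioning logic may differ in its details, and the benchmark source shown is a hypothetical example.

```python
import hashlib


def source_digest(source: str) -> str:
    """Hex digest of a piece of benchmark source code (SHA-256 as a stand-in)."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


# Same behavior, different layout (e.g. after running a code formatter).
original = "def time_sum():\n    return sum(range(1000))\n"
reformatted = "def time_sum():\n    return sum(\n        range(1000)\n    )\n"

digest_before = source_digest(original)
digest_after = source_digest(reformatted)

# The digests differ, so results recorded before and after the reformat
# would be treated as belonging to two different benchmarks.
digests_differ = digest_before != digest_after
```

asv also documents a `version` attribute that can be set explicitly on a benchmark in place of the automatically computed one; whether that fits this suite is a separate design decision.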
