correctionlib

Introduction

The purpose of this library is to provide a well-structured JSON data format for a wide variety of ad-hoc correction factors encountered in a typical HEP analysis and a companion evaluation tool suitable for use in C++ and python programs. Here we restrict our definition of correction factors to a class of functions with scalar inputs that produce a scalar output.

In python, the function signature is:

from typing import Union

def f(*args: Union[str,int,float]) -> float:
    return ...

In C++, the evaluator currently implements this as:

double Correction::evaluate(const std::vector<std::variant<int, double, std::string>>& values) const;

The supported function classes include:

  • multi-dimensional binned lookups;
  • binned lookups pointing to multi-argument formulas with a restricted math function set (exp, sqrt, etc.);
  • categorical (string or integer enumeration) maps;
  • input transforms (updating one input value in place); and
  • compositions of the above.
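As a rough illustration of the first, third, and last items in this list, the sketch below composes a categorical map with a two-dimensional binned lookup in plain Python. It is not the library's implementation: the table values, the bin edges, the names `find_bin` and `scale_factor`, and the choice to clamp out-of-range inputs to the nearest bin are all assumptions made for the example.

```python
from bisect import bisect_right
from typing import Sequence, Union

Value = Union[str, int, float]


def find_bin(edges: Sequence[float], x: float) -> int:
    """Return the index of the bin containing x (lower edge inclusive).

    Out-of-range values are clamped to the first or last bin; this is an
    assumption of the sketch, not a statement about the library's behavior.
    """
    idx = bisect_right(edges, x) - 1
    return min(max(idx, 0), len(edges) - 2)


# Hypothetical 2D table: rows are |eta| bins, columns are pt bins.
ETA_EDGES = [0.0, 1.5, 2.5]
PT_EDGES = [20.0, 40.0, 80.0, 200.0]
TABLE = [
    [0.98, 1.00, 1.02],
    [0.95, 0.97, 0.99],
]

# Hypothetical categorical map: variation name -> multiplier.
VARIATION = {"nominal": 1.00, "up": 1.05, "down": 0.95}


def scale_factor(*args: Value) -> float:
    """Scalar inputs in, scalar out: a categorical map composed with a 2D lookup."""
    variation, eta, pt = args
    row = find_bin(ETA_EDGES, abs(float(eta)))
    col = find_bin(PT_EDGES, float(pt))
    return VARIATION[str(variation)] * TABLE[row][col]
```

Calling `scale_factor("nominal", 0.3, 50.0)` selects the first |eta| row and the second pt column of the table and multiplies the entry by the nominal factor.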

Each function type is represented by a "node" in a call graph and holds all of its parameters in a JSON structure, described by the JSON schema. Possible future extension nodes might include weighted sums (which, when composed with the others, could represent a BDT) and perhaps simple MLPs.
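To make the idea of a node graph stored as JSON more concrete, here is a minimal sketch of a recursive evaluator over a small nested document. The key names (`node`, `input`, `edges`, `content`) and the function `evaluate` are invented for this illustration and should not be read as the library's actual schema; the real field names are defined by the JSON schema mentioned above.

```python
import json
from bisect import bisect_right
from typing import Dict, Union

Value = Union[str, int, float]

# Illustrative document only: key names are assumptions, not the real schema.
DOC = json.loads("""
{
  "node": "category",
  "input": "flavor",
  "content": {
    "b": {"node": "binning", "input": "pt",
          "edges": [20, 50, 100], "content": [0.9, 1.1]},
    "c": 1.0
  }
}
""")


def evaluate(node, inputs: Dict[str, Value]) -> float:
    """Walk the graph recursively until a plain number (a leaf) is reached."""
    if isinstance(node, (int, float)):
        return float(node)
    value = inputs[node["input"]]
    if node["node"] == "category":
        # Each category key points at a child node or a leaf value.
        return evaluate(node["content"][str(value)], inputs)
    if node["node"] == "binning":
        edges = node["edges"]
        if not edges[0] <= value < edges[-1]:
            raise ValueError(f"{node['input']}={value} is outside the bin range")
        idx = bisect_right(edges, value) - 1
        return evaluate(node["content"][idx], inputs)
    raise ValueError(f"unknown node type: {node['node']}")
```

Because every node carries its own parameters, composition falls out of the recursion: a category node can point at a binning node, which can in turn point at a constant or another node.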

The tool should provide:

  • standardized, versioned JSON schemas;
  • forward-porting tools (to migrate data written in older schema versions); and
  • a well-optimized C++ evaluator and python bindings (with numpy vectorization support).

This tool will definitely not provide:

  • support for TLorentzVector or other object-type inputs (such tools should be written as a higher-level tool depending on this library as a low-level tool)

Formula support currently includes a mostly-complete subset of the ROOT library TFormula class, and is implemented in a thread-safe, standalone manner. The parsing grammar is formally defined and parsed using a header-only PEG parser library. The supported features mirror CMSSW's reco::formulaEvaluator, and the implementation fully passes the test suite for that utility, with the purposeful exception of the TMath:: namespace. The Python bindings may be able to call into numexpr, though, due to the tree-like structure of the corrections, it may prove difficult to exploit vectorization at levels other than the entrypoint.
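The idea of evaluating formulas against a restricted function set can be sketched in a few lines of Python. Note the swap in technique: the example below relies on Python's own `ast` module and Python expression syntax rather than a PEG grammar or TFormula syntax, so it only illustrates the whitelisting principle. The names `eval_formula` and `ALLOWED_FUNCS` are hypothetical.

```python
import ast
import math
import operator
from typing import Dict

# Only these functions and operators are accepted; anything else is rejected.
ALLOWED_FUNCS = {"exp": math.exp, "sqrt": math.sqrt, "log": math.log}
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def eval_formula(expr: str, variables: Dict[str, float]) -> float:
    """Evaluate a Python-syntax arithmetic expression over named variables."""

    def visit(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return float(node.value)
        if isinstance(node, ast.Name) and node.id in variables:
            return float(variables[node.id])
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            return BINARY_OPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
            return UNARY_OPS[type(node.op)](visit(node.operand))
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in ALLOWED_FUNCS
            and not node.keywords
        ):
            return ALLOWED_FUNCS[node.func.id](*(visit(a) for a in node.args))
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    return visit(ast.parse(expr, mode="eval"))
```

For example, `eval_formula("0.5 + 2*sqrt(x)", {"x": 4.0})` evaluates the expression with `x` bound to 4.0, while an expression that calls any function outside `ALLOWED_FUNCS` raises `ValueError` instead of being executed.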

Detailed instructions for installing and using this package are provided in the documentation.

Creating new corrections

The correctionlib.schemav2 module provides a helpful framework for defining correction objects and correctionlib.convert includes select conversion routines for common types. Nodes can be type-checked as they are constructed using the parse_obj class method or by directly constructing them using keyword arguments. Some examples can be found in data/conversion.py. The tests/ directory may also be helpful.

Developing

See CONTRIBUTING.md

