blueice's People

Contributors

adambrown1, dachengx, hammannr, jelleaalbers, jingqiangye, kdund, pelssers


blueice's Issues

Rebinning to larger bins loses events

When calling h.rebin(1, .5), even when rebinning to an integer fraction of the original bins, there seems to be a loss of events, while you would expect each larger bin to contain the sum of the original bins.
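
For reference, the expected invariant (a generic numpy illustration, not blueice code; the histogram class behind h.rebin is not shown here): merging bins by an integer factor should conserve the total event count.

```python
import numpy as np

# Hypothetical illustration of the expected invariant: rebinning a
# histogram by an integer factor should conserve the total count.
counts = np.array([4, 1, 3, 2, 5, 7])       # six original bins
factor = 2                                  # merge pairs of adjacent bins
rebinned = counts.reshape(-1, factor).sum(axis=1)

assert rebinned.sum() == counts.sum()       # no events lost or gained
print(rebinned)                             # [ 5  5 12]
```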

Make Model picklable, or otherwise support remote parallel computation

At the moment Model doesn't pickle, so the models that are computed in parallel during LogLikelihood.prepare() can't actually be returned. It still works because the computed PDFs end up in the PDF cache: we can then load them in serially in the main process much faster... as long as the main process and the ipyparallel engines share the same pdf_cache location. If not, e.g. if the engines are on a different machine, parallel computation doesn't work yet.
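
A generic illustration of the underlying problem (a toy class, not blueice's actual Model): objects that hold lambdas or other locally defined functions cannot be pickled, so they cannot be shipped back from remote engines.

```python
import pickle

class Model:
    def __init__(self):
        # Lambdas are not picklable: a common reason complex objects
        # fail to round-trip through pickle.
        self.efficiency = lambda e: 0.5 * e

try:
    pickle.dumps(Model())
except (pickle.PicklingError, AttributeError, TypeError) as e:
    print("Model does not pickle:", e)
```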

Rate parameter malfunctions when there are also shape parameters

The implementation for rate parameters assumes the fraction of events that fall in the analysis space is the same for every model. This is not true when there are shape parameters.

A related problem exists for rate uncertainties (rate parameters with a prior). The prior is currently defined over the absolute rate, but this is affected by shape parameters too.

I'm going to switch rate parameters to be multipliers of the rate to get out of this mess. This will mean a few examples in laidbax have to be changed.
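
A minimal sketch of the proposed scheme (names are illustrative): with a multiplier, the expected count in the analysis space factorizes into the multiplier, a nominal rate, and an acceptance that may depend on the shape parameters.

```python
# mu = rate_multiplier * nominal_rate * acceptance(shape parameters)
def expected_events(rate_multiplier, nominal_rate, acceptance):
    return rate_multiplier * nominal_rate * acceptance

# If a shape change moves events out of the analysis space, only the
# acceptance changes; the rate multiplier keeps its meaning.
print(expected_events(1.0, 100.0, 0.80))   # 80 expected events, nominal shape
print(expected_events(1.0, 100.0, 0.65))   # 65 expected events, shifted shape
```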

Minuit integration

It would be nice if there were a bestfit_minuit just like bestfit_scipy. Physicists like minuit.
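
A hedged sketch of what a bestfit_minuit could look like, using the iminuit 2.x API (the function and parameter names here are illustrative, not blueice's):

```python
from iminuit import Minuit

def bestfit_minuit(neg_log_likelihood, guess):
    """Minimize a -log(L) function; guess maps parameter names to starting values."""
    m = Minuit(neg_log_likelihood, **guess)
    m.errordef = Minuit.LIKELIHOOD      # 0.5, appropriate for -log(L)
    m.migrad()
    return dict(zip(m.parameters, m.values)), m.fval

# Toy -log(L), minimal at rate=2, shift=-1:
result, fval = bestfit_minuit(
    lambda rate, shift: (rate - 2) ** 2 + (shift + 1) ** 2,
    dict(rate=1.0, shift=0.0))
print(result, fval)
```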

Pass more settings to minuit in bestfit_minuit

According to https://github.com/JelleAalbers/blueice/blob/master/blueice/inference.py#L206, our minuit implementation does not yet account for bounds. That's probably OK, since the likelihood function just returns -float('inf') outside physical bounds, but maybe minuit would appreciate it (and perhaps be more reliable) if we told it our bounds more gently.

If we're planning to use or report the errors from minuit at some point, we should also set the error_def parameter to 0.5. According to https://nbviewer.jupyter.org/github/iminuit/iminuit/blob/master/tutorial/basic_tutorial.ipynb this sets up the right magic for negative log likelihood functions.
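
In the iminuit 2.x API, both suggestions look roughly like this (a sketch with a toy objective, not blueice's actual wrapper):

```python
from iminuit import Minuit

def nll(rate):                       # toy negative log-likelihood
    return (rate - 2.0) ** 2

m = Minuit(nll, rate=1.0)
m.limits["rate"] = (0, None)         # tell minuit the physical bound directly
m.errordef = Minuit.LIKELIHOOD       # 0.5, the right error definition for -log(L)
m.migrad()
print(m.values["rate"], m.errors["rate"])
```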

Validate arguments to likelihood function

Currently LogLikelihood.__call__ accepts **kwargs and searches it for arguments corresponding to known rate and shape parameters. Thus, if you mistype a parameter name, it is silently ignored and appears to do nothing. It would be better to get an error when this happens instead.
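
A minimal sketch of the proposed check (the parameter names below are made up; blueice would use its actual rate and shape parameter sets):

```python
KNOWN_PARAMETERS = {"er_rate_multiplier", "leff"}   # illustrative names

def validate_kwargs(kwargs):
    unknown = set(kwargs) - KNOWN_PARAMETERS
    if unknown:
        raise ValueError("Unknown likelihood parameter(s): %s"
                         % sorted(unknown))

validate_kwargs({"er_rate_multiplier": 1.2})        # passes silently
try:
    validate_kwargs({"er_rate_multiplyer": 1.2})    # note the typo
except ValueError as e:
    print(e)   # the typo is now reported instead of silently ignored
```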

Add derived quantities

In some analyses, the analysis space may partially depend on nuisance parameters, e.g. reconstructed energy. The Source and Model classes could be modified to account for this, for example by interpolating the derived quantities between anchor points.
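
For instance, a derived quantity could be linearly interpolated between anchor values of a nuisance parameter (a toy sketch; the anchor values and numbers are made up):

```python
import numpy as np

anchors = np.array([0.8, 1.0, 1.2])             # nuisance-parameter anchor points
e_rec_at_anchors = np.array([9.2, 10.0, 10.7])  # derived quantity at each anchor

def reconstructed_energy(nuisance_value):
    # Linear interpolation between the anchor points
    return np.interp(nuisance_value, anchors, e_rec_at_anchors)

print(reconstructed_energy(1.1))   # interpolates between the 1.0 and 1.2 anchors
```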

Move shape parameter grids, interpolation to be per source

Currently, blueice instantiates a model at every point of the anchor grid spanned by the shape parameter variations of all sources combined. The same grid extends to the pdf interpolation.

The pdfs are cached and only loaded into memory once, so the second problem (the interpolation) poses a larger challenge than the first (the instantiation).

Proposed steps:

  1. Create a wrapper "interpolator" that contains (and passes the relevant parameters to) per-source interpolators, each covering only the shape directions relevant to its source (see the sketch after this list).
  2. Consider modifying or removing the model layer. Possible solutions:
    2.1) Remove the model layer completely, letting the likelihood call a collection of sources (possibly with a source_collection class that provides access to the sources, similar to 2.2).
    2.2) Change the model to be per physical model: instantiate one model for ER, one for NR, etc., and rename "source" to anchor_source or model_point.
    2.3) Maintain the current structure, while moving most of the functionality (interpolation etc.) to be per source.
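
A rough sketch of step 1 (all names are illustrative): a wrapper that forwards to each source's own interpolator only the shape parameters that source actually depends on.

```python
class PerSourceInterpolator:
    """Dispatch to per-source interpolators in only the relevant shape directions."""

    def __init__(self, interpolators, relevant_params):
        self.interpolators = interpolators   # {source_name: callable(**params)}
        self.relevant = relevant_params      # {source_name: set of parameter names}

    def __call__(self, source_name, **params):
        wanted = {k: v for k, v in params.items()
                  if k in self.relevant[source_name]}
        return self.interpolators[source_name](**wanted)

# Toy usage: "er" only depends on photon_yield, "nr" only on leff.
itp = PerSourceInterpolator(
    interpolators={"er": lambda photon_yield: 10 * photon_yield,
                   "nr": lambda leff: 5 * leff},
    relevant_params={"er": {"photon_yield"}, "nr": {"leff"}})
print(itp("er", photon_yield=1.1, leff=0.9))   # leff is ignored for "er"
```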

Binned likelihoods, pdfs

I am looking at adding functionality for binned likelihoods in blueice (the first wimp analysis will probably be binned, with bins derived from calibration data).
As the available statistics vary greatly over the analysis range, I believe bin size and shape should be as flexible as possible; my preference would be to use a function of the event data that returns a bin index.

My first idea was to simply repurpose XENONSource: add an analysis variable called "bin_index", compute it for each event, and use the index as an analysis variable. However, as both the Source and Likelihood classes would need modification, my current approach is to create BinnedSource, (Model,) and Likelihood classes inheriting from the un-binned counterparts and re-implementing what is needed. Does this sound sensible?
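
The "function of the event data that returns a bin index" idea could look like this (a toy sketch with made-up, irregular energy bins):

```python
import numpy as np

edges = np.array([0, 5, 10, 20, 50, 100])   # irregular edges: coarser where statistics run out

def bin_index(energy):
    # np.searchsorted maps each event to the interval containing it
    return np.searchsorted(edges, energy, side="right") - 1

events = np.array([2.3, 7.7, 42.0])
print(bin_index(events))                    # [0 1 3]
```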

Minuit 2 support

Blueice's minuit wrappers assume the minuit 1 API. The minuit 2 API is slightly different, see here. Unfortunately this means blueice's minuit inference methods no longer work in a post-2020 setup; we instead get an error about a missing 'print_level' initialization argument. That could be all, or it could be the tip of the iceberg.
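
The gist of the API change, assuming the error comes from constructor keywords that iminuit 2.x removed (a sketch, not blueice's actual wrapper):

```python
from iminuit import Minuit

def nll(rate):
    return (rate - 2.0) ** 2

# iminuit 1.x style -- fails on iminuit >= 2.0 with an error about the
# 'print_level' initialization argument:
#   m = Minuit(nll, print_level=1, errordef=0.5, rate=1.0, limit_rate=(0, None))

# iminuit 2.x style: configuration moves from constructor keywords to attributes.
m = Minuit(nll, rate=1.0)
m.print_level = 1
m.errordef = Minuit.LIKELIHOOD
m.limits["rate"] = (0, None)
m.migrad()
```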

MC/calibration data statistical uncertainty nuisance parameters

Currently the statistical errors on a PDF for a given combination of parameters are assumed to be negligible. When you make a PDF from MC, you can often get to this happy point if you are patient... but not when deriving a PDF from data.

It would be nice if there were an option to have a parameter varying the expectation in each bin of each PDF used, with a corresponding Poisson term in the likelihood -- or at least one such parameter/term for each bin of the total PDF. However:

  • Minimizing these guys might be a tough cookie for the minimizer... the LHC folks have some special magic for this (Beeston-Barlow light) that may be worth looking into; see the sketch after this list.
  • This won't generalize to using KDEs as density estimators, for which we might have to do e.g. bootstrapping of the calibration/MC data;
  • When you don't have enough counts in a bin you are in trouble anyway (what's the error on a bin in which you see no events? Surely not 0...)
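
For a flavor of the per-bin approach, here is a rough sketch loosely in the spirit of Beeston-Barlow light (a simplified toy, not a vetted implementation): each bin gets a scale factor beta_i, constrained by the MC counts in that bin.

```python
import numpy as np
from scipy.stats import poisson

def neg_log_likelihood(beta, observed, expected_mc, mc_counts):
    # Data term: observed counts given the scaled per-bin expectations
    data_term = -poisson.logpmf(observed, beta * expected_mc).sum()
    # Constraint term: MC counts given the same per-bin scale factors
    constraint = -poisson.logpmf(mc_counts, beta * mc_counts).sum()
    return data_term + constraint

beta = np.ones(3)   # one nuisance parameter per bin, nominal value 1
print(neg_log_likelihood(beta,
                         observed=np.array([5, 8, 2]),
                         expected_mc=np.array([4.5, 9.0, 1.5]),
                         mc_counts=np.array([45, 90, 15])))
```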

Allow negative rate multipliers

It might be desirable to subtract two pdfs (say, to avoid double-counting a contaminant of a calibration source).

Proposed implementation:
  • with a suitable flag, allow a source's rate_multiplier to be negative;
  • if such a flag is set, truncate the summed pdf to be at or above 0.
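
A minimal sketch of that proposal (toy numbers; note that after truncation the summed pdf would need renormalizing):

```python
import numpy as np

def summed_pdf(pdfs, rate_multipliers):
    total = sum(m * p for m, p in zip(rate_multipliers, pdfs))
    return np.clip(total, 0, None)          # truncate negative densities to 0

calib = np.array([0.2, 0.5, 0.3])           # calibration-source pdf
contaminant = np.array([0.0, 0.4, 0.6])     # contaminant pdf to subtract
print(summed_pdf([calib, contaminant], [1.0, -0.6]))   # last bin clips to 0
```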
