jelleaalbers / blueice
Build Likelihoods Using Efficient Interpolations and Monte Carlo-generated Events
License: BSD 3-Clause "New" or "Revised" License
When calling h.rebin(1, .5), even when rebinning to an integer fraction of the number of bins, there seems to be a loss of events, whereas you would expect each new, larger bin to contain the sum of the original bins it merges.
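The expected behavior can be pinned down with a toy example (pure Python; `rebin_by_factor` is a hypothetical helper, not the histogram library's implementation): merging bins must conserve the total event count exactly.

```python
def rebin_by_factor(counts, factor):
    """Merge every `factor` adjacent bins by summing their contents.

    The total number of events must be conserved exactly.
    """
    if len(counts) % factor != 0:
        raise ValueError("factor must divide the number of bins")
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]

counts = [1, 2, 3, 4, 5, 6]
merged = rebin_by_factor(counts, 2)
# merged == [3, 7, 11]; 21 events before and after
```

A regression test along these lines would make any event loss easy to demonstrate.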
At the moment Model doesn't pickle, so the models that are computed in parallel during LogLikelihood.prepare() can't actually be returned. It still works because the computed PDFs end up in the PDF cache: we can then load them serially in the main process much faster... as long as the main process and the ipyparallel engines share the same pdf_cache location. If not, e.g. if the engines are actually on a different machine, parallel computation doesn't work yet.
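The root cause is easy to reproduce: an instance whose state contains something unpicklable (a lambda, an open file handle, a locally defined function) cannot cross process boundaries. A self-contained illustration; this Model is a stand-in, not blueice's actual class:

```python
import pickle

class Model:
    """Stand-in for a model object holding an unpicklable attribute."""
    def __init__(self):
        self.interpolator = lambda x: x  # lambdas cannot be pickled

def is_picklable(obj):
    """Return True if `obj` survives a pickle round-trip attempt."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False

picklable = is_picklable(Model())  # False: this is why parallel return fails
```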
The implementation for rate parameters assumes the fraction of events that fall in the analysis space is the same for every model. This is not true when there are shape parameters.
A related problem exists for rate uncertainties (rate parameters with a prior). The prior is currently defined over the absolute rate, but this is affected by shape parameters too.
I'm going to switch rate parameters to be multipliers of the nominal rate to get out of this mess. This will mean a few examples in laidbax have to be changed.
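The difference can be illustrated with toy numbers (all names here are hypothetical, not blueice's API): with an absolute-rate parameter, the expected count in the analysis space silently changes meaning when a shape parameter changes the fraction of events inside it; a rate multiplier keeps the parameter's interpretation fixed.

```python
def expected_events(base_rate, rate_multiplier, fraction_in_space):
    """Expected events in the analysis space when the rate parameter
    is a multiplier of the source's nominal rate."""
    return rate_multiplier * base_rate * fraction_in_space

# A shape parameter moves the fraction of events inside the analysis space:
nominal = expected_events(base_rate=100.0, rate_multiplier=1.0,
                          fraction_in_space=0.8)
shifted = expected_events(base_rate=100.0, rate_multiplier=1.0,
                          fraction_in_space=0.6)
# The multiplier still means "scale of the nominal rate" in both cases;
# only the prediction changes (about 80 vs 60 expected events).
```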
It would be nice if there were a bestfit_minuit just like bestfit_scipy. Physicists like minuit.
According to https://github.com/JelleAalbers/blueice/blob/master/blueice/inference.py#L206, our minuit implementation does not yet account for bounds. That's probably OK, since the likelihood function just gives -float('inf') outside physical bounds, but maybe minuit would appreciate it (and perhaps be more reliable) if we told it our bounds more gently.
If we're planning to use or report the errors from minuit at some point, we should also set the error_def parameter to 0.5. According to https://nbviewer.jupyter.org/github/iminuit/iminuit/blob/master/tutorial/basic_tutorial.ipynb this sets up the right magic for negative log-likelihood functions.
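The factor 0.5 comes from the likelihood-ratio definition of parameter errors: for a negative log-likelihood, the one-sigma interval is where -lnL rises by 0.5 above its minimum (for a least-squares cost it rises by 1, which is why the default differs). A self-contained check with a Gaussian likelihood, no iminuit needed:

```python
import math

def neg_log_likelihood(mu, x=2.0, sigma=3.0):
    """-ln L for a single Gaussian observation x with known sigma
    (constant terms dropped)."""
    return (x - mu) ** 2 / (2 * sigma ** 2)

best = neg_log_likelihood(2.0)             # minimum: mu equals the observation
one_sigma = neg_log_likelihood(2.0 + 3.0)  # mu exactly one sigma away
delta = one_sigma - best                   # -lnL rises by exactly 0.5
```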
Currently our Travis builds are failing with the very unhelpful "core dumped" message. I first thought it was related to iminuit, but removing all iminuit installation instructions hasn't helped. I hope this is a temporary problem on the Travis CI side; otherwise we may have to use tools like https://github.com/springmeyer/travis-coredump to diagnose it.
Currently LogLikelihood.__call__ accepts **kwargs and searches it for arguments corresponding to known rate and shape parameters. Thus, if you mistype a parameter, it is silently ignored and appears to do nothing. It would be better to get an error when this happens instead.
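A minimal sketch of the proposed check, assuming the likelihood object knows its parameter names (`known_parameters` and the toy class below are hypothetical, not blueice's actual attributes):

```python
class LogLikelihood:
    """Toy stand-in illustrating the proposed validation in __call__."""
    known_parameters = {"er_rate_multiplier", "wimp_mass"}

    def __call__(self, **kwargs):
        unknown = set(kwargs) - self.known_parameters
        if unknown:
            raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")
        return 0.0  # the real implementation would evaluate the likelihood

ll = LogLikelihood()
result = ll(wimp_mass=50)   # fine
try:
    ll(wimp_maas=50)        # typo now raises instead of being ignored
except ValueError as e:
    caught = str(e)
```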
When starting multiple batch jobs on Midway, it is necessary to run a "burn-in" run beforehand; otherwise multiple jobs may attempt to write to the same cache file and corrupt it. Deleting the corrupted cache and re-running is then required.
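One way to make the cache robust to concurrent writers without a burn-in run is an atomic write: each job writes to a unique temporary file in the cache directory and then renames it into place (`os.replace` is atomic on POSIX filesystems). A sketch under that assumption, not blueice's current code:

```python
import os
import tempfile

def atomic_write(path, data):
    """Write `data` (bytes) to `path` so readers never see a partial file.

    Concurrent writers still race, but the loser simply overwrites the
    file with an identical cache entry instead of corrupting it.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)  # atomic rename into place
    except BaseException:
        os.remove(tmp)
        raise

# Demo in a throwaway directory
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "pdf_cache.bin")
atomic_write(demo_path, b"cached pdf bytes")
with open(demo_path, "rb") as f:
    restored = f.read()
```

Note this only protects single-file writes; a shared network filesystem with non-POSIX rename semantics would need a different scheme (e.g. per-job cache files merged afterwards).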
In some analyses, the analysis may be performed in a space that partially depends on nuisance parameters, e.g. reconstructed energy. Source and model classes could be modified to account for this, for example by interpolating the derived quantities between anchor points.
Travis has shut down its travis-ci.org site and no longer offers free builds for open-source projects. We should migrate to GitHub Actions to maintain continuous-integration testing if people still want to develop blueice.
Currently, blueice instantiates a model at every point of the anchor grid spanned by the shape parameter variations from all sources. This extends to the pdf interpolation.
The pdfs are cached and only loaded into memory once, so the second problem poses a larger challenge than the first.
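The cost of a shared grid is multiplicative: every source is evaluated at the Cartesian product of all sources' anchor points, including anchors that only matter for one source. A toy illustration (parameter names are hypothetical):

```python
from itertools import product

# Hypothetical anchor points; suppose only source A cares about `leakage`
# and only source B cares about `field`, yet the shared grid spans both.
anchors = {
    "leakage": [0.8, 1.0, 1.2],
    "field": [80, 120],
    "yield": [0.9, 1.0, 1.1],
}
shared_grid = list(product(*anchors.values()))
# 3 * 2 * 3 = 18 models per source, instead of the handful of anchors
# each source actually needs along its own parameters.
```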
Proposed steps:
I am looking at adding functionality for binned likelihoods in blueice (the first WIMP analysis will probably be binned, with bins derived from calibration data).
As the available statistics vary greatly over the analysis range, I believe bin size and shape should be as flexible as possible; my preference would be to use a function of the event data that returns a bin index.
My first idea was to simply re-purpose XENONSource: add an analysis variable called "bin_index", compute it for each event, and use the index as an analysis variable. However, as both the Source and Likelihood classes would need modification, my current approach is to create BinnedSource, Model, and Likelihood classes inheriting from the unbinned counterparts and re-implementing what is needed. Does this sound sensible?
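The "function of the event data that returns a bin index" idea could be as simple as a lookup over irregular edges; a stdlib sketch with hypothetical names (in practice the edges would come from calibration data):

```python
import bisect

# Irregular energy bin edges, e.g. derived from calibration statistics
energy_edges = [2.0, 5.0, 10.0, 20.0, 50.0]

def bin_index(event):
    """Map an event (here a dict of observables) to a flat bin index.

    Returns -1 for events outside the binning, so under/overflow can be
    dropped or collected separately.
    """
    e = event["energy"]
    i = bisect.bisect_right(energy_edges, e) - 1
    if i < 0 or i >= len(energy_edges) - 1:
        return -1
    return i

idx = bin_index({"energy": 7.3})  # falls in [5, 10), i.e. bin 1
```

Because any callable works here, non-rectangular or data-driven binnings (e.g. equal-statistics bins) drop in without changing the likelihood code.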
Blueice's minuit wrappers assume the minuit 1 API. The minuit 2 API is slightly different, see here. Unfortunately this means blueice's minuit inference methods no longer work in post-2020 setups; instead we get an error about a missing 'print_level' initialization argument. That could be all, or it could be the tip of the iceberg.
Currently you can only vary parameters of the model as shape uncertainties, not parameters of each individual source (such as its energy distribution).
Currently the statistical errors on a PDF for a given combination of parameters are assumed to be negligible. When you make a PDF from MC, you can often get to this happy point if you are patient, but not when deriving a PDF from data.
It would be nice if there were an option to have a parameter vary the expectation in each bin of each PDF used, with a corresponding Poisson term in the likelihood -- or at least one such parameter/term for each bin of the total PDF. However:
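One common recipe along these lines (not currently in blueice; in the spirit of "Barlow-Beeston-lite" treatments of MC statistics) gives each bin a nuisance multiplier on the expectation, constrained by the relative MC uncertainty in that bin. A sketch of one bin's likelihood contribution, with all names hypothetical and a Gaussian constraint chosen for simplicity:

```python
import math

def neg_log_likelihood_bin(n_obs, mu, t, sigma_mc):
    """One bin's contribution: extended-Poisson term for the observed
    count with the expectation mu scaled by nuisance multiplier t, plus
    a Gaussian constraint tying t to 1 with relative MC error sigma_mc
    (constant terms dropped)."""
    expected = mu * t
    poisson = expected - n_obs * math.log(expected)
    constraint = (t - 1.0) ** 2 / (2 * sigma_mc ** 2)
    return poisson + constraint

# With a loose constraint, pulling t toward n_obs / mu lowers the total:
at_nominal = neg_log_likelihood_bin(n_obs=12, mu=10.0, t=1.0, sigma_mc=0.5)
at_pulled = neg_log_likelihood_bin(n_obs=12, mu=10.0, t=1.2, sigma_mc=0.5)
```

The downside the issue alludes to is real: this adds one nuisance parameter per bin per PDF, which the minimizer then has to profile out.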
It might be desirable to subtract two pdfs. (say, to avoid double-counting of a contaminant of a calibration source)
Proposed implementation:
- With a suitable flag, allow a source's rate_multiplier to be negative.
- If such a flag is set, truncate the summed pdf at zero.
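A sketch of the proposed summation with truncation (pure Python over per-bin pdf values; note that clipping at zero changes the normalization, so the result would need renormalizing before use as a pdf):

```python
def summed_pdf(pdfs, multipliers):
    """Sum per-bin pdf values weighted by (possibly negative) rate
    multipliers, truncating negative bins at zero."""
    total = [
        sum(m * p[i] for m, p in zip(multipliers, pdfs))
        for i in range(len(pdfs[0]))
    ]
    return [max(v, 0.0) for v in total]

calibration = [0.2, 0.5, 0.3]
contaminant = [0.0, 0.2, 0.8]
# Subtract the contaminant with a negative multiplier:
clean = summed_pdf([calibration, contaminant], [1.0, -0.5])
# clean is approximately [0.2, 0.4, 0.0]; the last bin went negative
# (0.3 - 0.4) and was clipped to zero.
```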