pyhf / pyhf-tutorial Goto Github PK
View Code? Open in Web Editor NEWUpdating tutorial on pyhf
Home Page: https://pyhf.github.io/pyhf-tutorial/
License: Apache License 2.0
Updating tutorial on pyhf
Home Page: https://pyhf.github.io/pyhf-tutorial/
License: Apache License 2.0
The launch button seems to be missing for binder. Not sure why, config seems right on a first look-through.
Following PR #61, in the Playing with Toys notebook, in the Hypothesis Testing (low stats) section using the toybased
calculator is yielding nan
s in the calculation:
CLs_obs, CLs_exp = pyhf.infer.hypotest(
1.0, # null hypothesis
[5.0, 7.0] + model.config.auxdata,
model,
test_stat="qtilde",
return_expected_set=True,
calctype="toybased",
ntoys=1000,
track_progress=False,
)
print(f" Observed CLs: {CLs_obs:.4f}")
for expected_value, n_sigma in zip(CLs_exp, np.arange(-2, 3)):
print(f"Expected CLs({n_sigma:2d} σ): {expected_value:.4f}")
/usr/local/venv/lib/python3.10/site-packages/scipy/optimize/_optimize.py:353: RuntimeWarning: Values in x were outside bounds during a minimize step, clipping to bounds
warnings.warn("Values in x were outside bounds during a "
Observed CLs: nan
Expected CLs(-2 σ): 0.0000
Expected CLs(-1 σ): 0.0452
Expected CLs( 0 σ): 0.1756
Expected CLs( 1 σ): 0.4375
Expected CLs( 2 σ): 1.0000
/usr/local/venv/lib/python3.10/site-packages/pyhf/infer/calculators.py:864: RuntimeWarning: invalid value encountered in divide
CLs = tensorlib.astensor(CLsb / CLb)
This is a reminder note that Belle II is now giving pyhf tutorials (:exploding_head:) as well. At the 2023 Belle II Summer Workshop Slavomira Stefkova gave a talk on systematics with pyhf and @lorenzennio gave a Belle II focused pyhf tutotrial.
In conversation with @kratsg we agreed that it would be interesting to learn from these tutorials and see if there are components we can pull into the pyhf user guide / tutorial and if there are experiment specific subsections that we could start to add as well.
I'm not sure why, but the Upper Limits notebook that was added in PR #57 is not running all the way through in the CI that generates the Jupyter Book webpage and is failing at the Model-independent Upper Limits section with a KeyboardInterrupt:
pyhf.set_backend("numpy", "scipy")
mu_tests = np.linspace(15, 25, 10)
(
obs_limit,
exp_limit_and_bands,
(poi_tests, tests),
) = pyhf.infer.intervals.upper_limits.upper_limit(
data, model, scan=mu_tests, level=0.05, return_results=True
)
---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
Cell In[11], line 8
1 pyhf.set_backend("numpy", "scipy")
3 mu_tests = np.linspace(15, 25, 10)
4 (
5 obs_limit,
6 exp_limit_and_bands,
7 (poi_tests, tests),
----> 8 ) = pyhf.infer.intervals.upper_limits.upper_limit(
9 data, model, scan=mu_tests, level=0.05, return_results=True
10 )
File /usr/local/venv/lib/python3.10/site-packages/pyhf/infer/intervals/upper_limits.py:252, in upper_limit(data, model, scan, level, return_results, **hypotest_kwargs)
210 """
211 Calculate an upper limit interval ``(0, poi_up)`` for a single Parameter of
212 Interest (POI) using root-finding or a linear scan through POI-space.
(...)
249 .. versionadded:: 0.7.0
250 """
251 if scan is not None:
--> 252 return linear_grid_scan(
253 data, model, scan, level, return_results, **hypotest_kwargs
254 )
255 # else:
256 bounds = model.config.suggested_bounds()[
257 model.config.par_slice(model.config.poi_name).start
258 ]
File /usr/local/venv/lib/python3.10/site-packages/pyhf/infer/intervals/upper_limits.py:189, in linear_grid_scan(data, model, scan, level, return_results, **hypotest_kwargs)
146 """
147 Calculate an upper limit interval ``(0, poi_up)`` for a single
148 Parameter of Interest (POI) using a linear scan through POI-space.
(...)
186 .. versionadded:: 0.7.0
187 """
188 tb, _ = get_backend()
--> 189 results = [
190 hypotest(mu, data, model, return_expected_set=True, **hypotest_kwargs)
191 for mu in scan
192 ]
193 obs = tb.astensor([[r[0]] for r in results])
194 exp = tb.astensor([[r[1][idx] for idx in range(5)] for r in results])
File /usr/local/venv/lib/python3.10/site-packages/pyhf/infer/intervals/upper_limits.py:190, in <listcomp>(.0)
146 """
147 Calculate an upper limit interval ``(0, poi_up)`` for a single
148 Parameter of Interest (POI) using a linear scan through POI-space.
(...)
186 .. versionadded:: 0.7.0
187 """
188 tb, _ = get_backend()
189 results = [
--> 190 hypotest(mu, data, model, return_expected_set=True, **hypotest_kwargs)
191 for mu in scan
192 ]
193 obs = tb.astensor([[r[0]] for r in results])
194 exp = tb.astensor([[r[1][idx] for idx in range(5)] for r in results])
File /usr/local/venv/lib/python3.10/site-packages/pyhf/infer/__init__.py:169, in hypotest(poi_test, data, pdf, init_pars, par_bounds, fixed_params, calctype, return_tail_probs, return_expected, return_expected_set, return_calculator, **kwargs)
157 _check_hypotest_prerequisites(pdf, data, init_pars, par_bounds, fixed_params)
159 calc = utils.create_calculator(
160 calctype,
161 data,
(...)
166 **kwargs,
167 )
--> 169 teststat = calc.teststatistic(poi_test)
170 sig_plus_bkg_distribution, bkg_only_distribution = calc.distributions(poi_test)
172 tb, _ = get_backend()
File /usr/local/venv/lib/python3.10/site-packages/pyhf/infer/calculators.py:389, in AsymptoticCalculator.teststatistic(self, poi_test)
378 asimov_mu = 1.0 if self.test_stat == 'q0' else 0.0
380 asimov_data, asimov_mubhathat = generate_asimov_data(
381 asimov_mu,
382 self.data,
(...)
387 return_fitted_pars=True,
388 )
--> 389 qmuA_v, (mubhathat_A, muhatbhat_A) = teststat_func(
390 poi_test,
391 asimov_data,
392 self.pdf,
393 self.init_pars,
394 self.par_bounds,
395 self.fixed_params,
396 return_fitted_pars=True,
397 )
398 self.sqrtqmuA_v = tensorlib.sqrt(qmuA_v)
399 self.fitted_pars = HypoTestFitResults(
400 asimov_pars=asimov_mubhathat,
401 free_fit_to_data=muhatbhat,
(...)
404 fixed_poi_fit_to_asimov=mubhathat_A,
405 )
File /usr/local/venv/lib/python3.10/site-packages/pyhf/infer/test_statistics.py:234, in qmu_tilde(mu, data, pdf, init_pars, par_bounds, fixed_params, return_fitted_pars)
229 if par_bounds[pdf.config.poi_index][0] != 0:
230 log.warning(
231 'qmu_tilde test statistic used for fit configuration with POI not bounded at zero.\n'
232 + 'Use the qmu test statistic (pyhf.infer.test_statistics.qmu) instead.'
233 )
--> 234 return _qmu_like(
235 mu,
236 data,
237 pdf,
238 init_pars,
239 par_bounds,
240 fixed_params,
241 return_fitted_pars=return_fitted_pars,
242 )
File /usr/local/venv/lib/python3.10/site-packages/pyhf/infer/test_statistics.py:27, in _qmu_like(mu, data, pdf, init_pars, par_bounds, fixed_params, return_fitted_pars)
19 """
20 Clipped version of _tmu_like where the returned test statistic
21 is 0 if muhat > 0 else tmu_like_stat.
(...)
24 qmu_tilde. Otherwise this is qmu (no tilde).
25 """
26 tensorlib, optimizer = get_backend()
---> 27 tmu_like_stat, (mubhathat, muhatbhat) = _tmu_like(
28 mu, data, pdf, init_pars, par_bounds, fixed_params, return_fitted_pars=True
29 )
30 qmu_like_stat = tensorlib.where(
31 muhatbhat[pdf.config.poi_index] > mu, tensorlib.astensor(0.0), tmu_like_stat
32 )
33 if return_fitted_pars:
File /usr/local/venv/lib/python3.10/site-packages/pyhf/infer/test_statistics.py:51, in _tmu_like(mu, data, pdf, init_pars, par_bounds, fixed_params, return_fitted_pars)
47 tensorlib, optimizer = get_backend()
48 mubhathat, fixed_poi_fit_lhood_val = fixed_poi_fit(
49 mu, data, pdf, init_pars, par_bounds, fixed_params, return_fitted_val=True
50 )
---> 51 muhatbhat, unconstrained_fit_lhood_val = fit(
52 data, pdf, init_pars, par_bounds, fixed_params, return_fitted_val=True
53 )
54 log_likelihood_ratio = fixed_poi_fit_lhood_val - unconstrained_fit_lhood_val
55 tmu_like_stat = tensorlib.astensor(
56 tensorlib.clip(log_likelihood_ratio, 0.0, max_value=None)
57 )
File /usr/local/venv/lib/python3.10/site-packages/pyhf/infer/mle.py:131, in fit(data, pdf, init_pars, par_bounds, fixed_params, **kwargs)
124 # get fixed vals from the model
125 fixed_vals = [
126 (index, init)
127 for index, (init, is_fixed) in enumerate(zip(init_pars, fixed_params))
128 if is_fixed
129 ]
--> 131 return opt.minimize(
132 twice_nll, data, pdf, init_pars, par_bounds, fixed_vals, **kwargs
133 )
File /usr/local/venv/lib/python3.10/site-packages/pyhf/optimize/mixins.py:193, in OptimizerMixin.minimize(self, objective, data, pdf, init_pars, par_bounds, fixed_vals, return_fitted_val, return_result_obj, return_uncertainties, return_correlations, do_grad, do_stitch, **kwargs)
190 par_names[index] = None
191 par_names = [name for name in par_names if name]
--> 193 result = self._internal_minimize(
194 **minimizer_kwargs, options=kwargs, par_names=par_names
195 )
196 result = self._internal_postprocess(
197 result, stitch_pars, return_uncertainties=return_uncertainties
198 )
200 _returns = [result.x]
File /usr/local/venv/lib/python3.10/site-packages/pyhf/optimize/mixins.py:52, in OptimizerMixin._internal_minimize(self, func, x0, do_grad, bounds, fixed_vals, options, par_names)
34 def _internal_minimize(
35 self,
36 func,
(...)
42 par_names=None,
43 ):
44 minimizer = self._get_minimizer(
45 func,
46 x0,
(...)
50 par_names=par_names,
51 )
---> 52 result = self._minimize(
53 minimizer,
54 func,
55 x0,
56 do_grad=do_grad,
57 bounds=bounds,
58 fixed_vals=fixed_vals,
59 options=options,
60 )
62 try:
63 assert result.success
File /usr/local/venv/lib/python3.10/site-packages/pyhf/optimize/opt_scipy.py:93, in scipy_optimizer._minimize(self, minimizer, func, x0, do_grad, bounds, fixed_vals, options)
90 else:
91 constraints = []
---> 93 return minimizer(
94 func,
95 x0,
96 method=method,
97 jac=do_grad,
98 bounds=bounds,
99 constraints=constraints,
100 tol=tolerance,
101 options=dict(maxiter=maxiter, disp=bool(verbose), **solver_options),
102 )
File /usr/local/venv/lib/python3.10/site-packages/scipy/optimize/_minimize.py:705, in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
702 res = _minimize_cobyla(fun, x0, args, constraints, callback=callback,
703 **options)
704 elif meth == 'slsqp':
--> 705 res = _minimize_slsqp(fun, x0, args, jac, bounds,
706 constraints, callback=callback, **options)
707 elif meth == 'trust-constr':
708 res = _minimize_trustregion_constr(fun, x0, args, jac, hess, hessp,
709 bounds, constraints,
710 callback=callback, **options)
File /usr/local/venv/lib/python3.10/site-packages/scipy/optimize/_slsqp_py.py:432, in _minimize_slsqp(func, x0, args, jac, bounds, constraints, maxiter, ftol, iprint, disp, eps, callback, finite_diff_rel_step, **unknown_options)
429 c = _eval_constraint(x, cons)
431 if mode == -1: # gradient evaluation required
--> 432 g = append(wrapped_grad(x), 0.0)
433 a = _eval_con_normals(x, cons, la, n, m, meq, mieq)
435 if majiter > majiter_prev:
436 # call callback if major iteration has incremented
File /usr/local/venv/lib/python3.10/site-packages/scipy/optimize/_optimize.py:346, in _clip_x_for_func.<locals>.eval(x)
344 def eval(x):
345 x = _check_clip_x(x, bounds)
--> 346 return func(x)
File /usr/local/venv/lib/python3.10/site-packages/scipy/optimize/_differentiable_functions.py:273, in ScalarFunction.grad(self, x)
271 if not np.array_equal(x, self.x):
272 self._update_x_impl(x)
--> 273 self._update_grad()
274 return self.g
File /usr/local/venv/lib/python3.10/site-packages/scipy/optimize/_differentiable_functions.py:256, in ScalarFunction._update_grad(self)
254 def _update_grad(self):
255 if not self.g_updated:
--> 256 self._update_grad_impl()
257 self.g_updated = True
File /usr/local/venv/lib/python3.10/site-packages/scipy/optimize/_differentiable_functions.py:173, in ScalarFunction.__init__.<locals>.update_grad()
171 self._update_fun()
172 self.ngev += 1
--> 173 self.g = approx_derivative(fun_wrapped, self.x, f0=self.f,
174 **finite_diff_options)
File /usr/local/venv/lib/python3.10/site-packages/scipy/optimize/_numdiff.py:505, in approx_derivative(fun, x0, method, rel_step, abs_step, f0, bounds, sparsity, as_linear_operator, args, kwargs)
502 use_one_sided = False
504 if sparsity is None:
--> 505 return _dense_difference(fun_wrapped, x0, f0, h,
506 use_one_sided, method)
507 else:
508 if not issparse(sparsity) and len(sparsity) == 2:
File /usr/local/venv/lib/python3.10/site-packages/scipy/optimize/_numdiff.py:576, in _dense_difference(fun, x0, f0, h, use_one_sided, method)
574 x = x0 + h_vecs[i]
575 dx = x[i] - x0[i] # Recompute dx as exactly representable number.
--> 576 df = fun(x) - f0
577 elif method == '3-point' and use_one_sided[i]:
578 x1 = x0 + h_vecs[i]
File /usr/local/venv/lib/python3.10/site-packages/scipy/optimize/_numdiff.py:456, in approx_derivative.<locals>.fun_wrapped(x)
455 def fun_wrapped(x):
--> 456 f = np.atleast_1d(fun(x, *args, **kwargs))
457 if f.ndim > 1:
458 raise RuntimeError("`fun` return value has "
459 "more than 1 dimension.")
File /usr/local/venv/lib/python3.10/site-packages/scipy/optimize/_differentiable_functions.py:137, in ScalarFunction.__init__.<locals>.fun_wrapped(x)
133 self.nfev += 1
134 # Send a copy because the user may overwrite it.
135 # Overwriting results in undefined behaviour because
136 # fun(self.x) will change self.x, with the two no longer linked.
--> 137 fx = fun(np.copy(x), *args)
138 # Make sure the function returns a true scalar
139 if not np.isscalar(fx):
File /usr/local/venv/lib/python3.10/site-packages/pyhf/optimize/opt_numpy.py:30, in wrap_objective.<locals>.func(pars)
28 pars = tensorlib.astensor(pars)
29 constrained_pars = stitch_pars(pars)
---> 30 return objective(constrained_pars, data, pdf)[0]
File /usr/local/venv/lib/python3.10/site-packages/pyhf/infer/mle.py:53, in twice_nll(pars, data, pdf)
12 def twice_nll(pars, data, pdf):
13 r"""
14 Two times the negative log-likelihood of the model parameters, :math:`\left(\mu, \boldsymbol{\theta}\right)`, given the observed data.
15 It is used in the calculation of the test statistic, :math:`t_{\mu}`, as defined in Equation (8) in :xref:`arXiv:1007.1727`
(...)
51 Tensor: Two times the negative log-likelihood, :math:`-2\ln L\left(\mu, \boldsymbol{\theta}\right)`
52 """
---> 53 return -2 * pdf.logpdf(pars, data)
File /usr/local/venv/lib/python3.10/site-packages/pyhf/pdf.py:945, in Model.logpdf(self, pars, data)
938 if data.shape[-1] != self.nominal_rates.shape[-1] + len(
939 self.config.auxdata
940 ):
941 raise exceptions.InvalidPdfData(
942 f'eval failed as data has len {data.shape[-1]} but {self.config.nmaindata + self.config.nauxdata} was expected'
943 )
--> 945 result = self.make_pdf(pars).log_prob(data)
947 if (
948 not self.batch_size
949 ): # force to be not scalar, should we changed with #522
950 return tensorlib.reshape(result, (1,))
File /usr/local/venv/lib/python3.10/site-packages/pyhf/pdf.py:907, in Model.make_pdf(self, pars)
896 """
897 Construct a pdf object for a given set of parameter values.
898
(...)
904
905 """
906 pdfobjs = []
--> 907 mainpdf = self.main_model.make_pdf(pars)
908 if mainpdf:
909 pdfobjs.append(mainpdf)
File /usr/local/venv/lib/python3.10/site-packages/pyhf/pdf.py:636, in _MainModel.make_pdf(self, pars)
635 def make_pdf(self, pars):
--> 636 lambdas_data = self.expected_data(pars)
637 return prob.Independent(prob.Poisson(lambdas_data))
File /usr/local/venv/lib/python3.10/site-packages/pyhf/pdf.py:698, in _MainModel.expected_data(self, pars, return_by_sample)
696 tensorlib, _ = get_backend()
697 pars = tensorlib.astensor(pars)
--> 698 deltas, factors = self._modifications(pars)
700 allsum = tensorlib.concatenate(deltas + [self.nominal_rates])
702 nom_plus_delta = tensorlib.sum(allsum, axis=0)
File /usr/local/venv/lib/python3.10/site-packages/pyhf/pdf.py:657, in _MainModel._modifications(self, pars)
653 def _modifications(self, pars):
654 deltas = list(
655 filter(
656 lambda x: x is not None,
--> 657 [self.modifiers_appliers[k].apply(pars) for k in self._delta_mods],
658 )
659 )
660 factors = list(
661 filter(
662 lambda x: x is not None,
663 [self.modifiers_appliers[k].apply(pars) for k in self._factor_mods],
664 )
665 )
667 return deltas, factors
File /usr/local/venv/lib/python3.10/site-packages/pyhf/pdf.py:657, in <listcomp>(.0)
653 def _modifications(self, pars):
654 deltas = list(
655 filter(
656 lambda x: x is not None,
--> 657 [self.modifiers_appliers[k].apply(pars) for k in self._delta_mods],
658 )
659 )
660 factors = list(
661 filter(
662 lambda x: x is not None,
663 [self.modifiers_appliers[k].apply(pars) for k in self._factor_mods],
664 )
665 )
667 return deltas, factors
File /usr/local/venv/lib/python3.10/site-packages/pyhf/modifiers/histosys.py:169, in histosys_combined.apply(self, pars)
166 else:
167 histosys_alphaset = self.param_viewer.get(pars)
--> 169 results_histo = self.interpolator(histosys_alphaset)
170 # either rely on numerical no-op or force with line below
171 results_histo = tensorlib.where(
172 self.histosys_mask, results_histo, self.histosys_default
173 )
File /usr/local/venv/lib/python3.10/site-packages/pyhf/interpolators/code4p.py:85, in code4p.__call__(self, alphasets)
80 alphas_times_deltas_dn = tensorlib.einsum(
81 'sa,shb->shab', alphasets, self.deltas_dn
82 )
84 # for |a| < 1
---> 85 asquare = tensorlib.power(alphasets, 2)
86 tmp1 = asquare * 3.0 - 10.0
87 tmp2 = asquare * tmp1 + 15.0
File /usr/local/venv/lib/python3.10/site-packages/pyhf/tensor/numpy_backend.py:287, in numpy_backend.power(self, tensor_in_1, tensor_in_2)
286 def power(self, tensor_in_1: Tensor[T], tensor_in_2: Tensor[T]) -> ArrayLike:
--> 287 return np.power(tensor_in_1, tensor_in_2)
KeyboardInterrupt:
Something is going wrong in the build, and this should get diagnosed. It might be some timeout issue as this cell takes the longest to run.
Came up as part of #76 which is a good way to motivate the usage of batched models.
Given @sabinekraml's good GitHub Discussion question (that we've seen before in Issues) scikit-hep/pyhf#1619, along with scikit-hep/pyhf#1620 we should make this more explicit by adding a notebook that demonstrates the differences between the expected and observed CLs values given
If you switch from observed data to expected data — your background-only model will have NPs centered at different values because they're fit to different datasets. The expected CLs has to be different.
Hi! I saw that there are a bunch of errors thrown in the current docs:
Hi!
While clicking through the tutorials with pyhf 0.7.0. I noticed that in "SimpleWorkspace", the cell containing
workspace.get_measurement(poi_name="mu")
failed because the signature seems to have changed: get_measurement() got an unexpected keyword argument 'poi_name'
Similarly, in "Combinations" Cell 5 fails as the combined_workspace
object doesn't seem to have a parameters
attribute.
I'm not entirely sure about this second one though as it also fails with an AttributeError here: https://pyhf.github.io/pyhf-tutorial/Combinations.html
Is this intended behavior?
I know these are very small things, however as a new user any unexpected behavior is a bit confusing.
Cheers,
Moritz
Hi all, looking through the tutorial page I've noticed that a tutorial was broken, namely https://pyhf.github.io/pyhf-tutorial/Toys.html#hypothesis-testing-revisited (if you scroll down to the very end of the output)
I assume it's a minor thing that you will be able to immediately fix, so I didn't dig into the error itself
Given problems in PR #57 of building a pip-compile
based lock file targeting Linux x86_64 when on a Apple silicon Mac, it is probably preferable to switch to building a lock file using a python:3.11
Docker container that can then force the environment to be x86_64
, even if emulation is needed.
The workflow should also having @matthewfeickert's personal path information in the lock file, like it is now:
pyhf-tutorial/book/requirements.lock
Line 5 in f913d9d
as this injects noise in the diffs if anyone else builds it.
As brought up by @klieret in Issue #18, changing
pyhf-tutorial/.github/workflows/deploy-jupyter-book.yml
Lines 38 to 39 in 1821e1a
to deploy on github.event_name != 'pull_request'
would allow for deployment on workflow_dispatch
.
Originally posted by @matthewfeickert in #18 (comment)
This issue describes the way of presenting SUSY limits as it has been agreed between ATLAS and CMS. An example from JHEP 12 (2019) 060 can be seen here:
Below an implementation of the calculation of the above CLs value using pyhf
and cabinetry
is presented.
The workspace is created by a cabinetry
configuration file. Is should be noted that this workspace includes multiple signal models, i.e. multiple signal Samples
with different NormFactors
each. An example of such configuration file can be seen here: config_excl_AnalysisChallenge_grid.yml.
# get model and observed data (pre-fit)
model, data = cabinetry.model_utils.model_and_data(ws)
channels = model.config.channels
# workspace and model without signal theory uncertainties. signal theory uncertainties have the string '_Signal_' in their name.
signal_theory_names = [systematic['Name'] for systematic in cabinetry_config['Systematics'] if '_Signal_' in systematic['Name']]
pruned_ws = pyhf.Workspace(ws)._prune_and_rename(prune_modifiers=signal_theory_names)
pruned_model, _ = cabinetry.model_utils.model_and_data(pruned_ws)
# signal normalisation parameter names (having the string 'Signal_norm' in their name).
signal_norms = [param for param in model.config.par_names if 'Signal_norm' in param]
# signal normalisation parameters indices for both full and pruned models
signal_norms_idx = [model.config.par_map[norm]['slice'].start for norm in signal_norms]
pruned_signal_norms_idx = [pruned_model.config.par_map[norm]['slice'].start for norm in signal_norms]
# signal theory modifier indices
signal_theory_idx = [model.config.par_map[modifier]['slice'].start for modifier in signal_theory_names]
def fix_and_calculate_CLs(param):
'''
Fix all signal normalisation parameters to zero except the one specified by param
then calculate expected and observed CLs values
'''
# Expected CLs
print(f"Calculating explected CLs for {param}")
# get the index of the param, this will be the new POI
pruned_poi_index = cabinetry.model_utils._poi_index(pruned_model, poi_name=param)
# set POI to param
pruned_model.config.set_poi(pruned_model.config.par_names[pruned_poi_index])
# indices of all the signal normalisation parameters except the poi_idx
pruned_fixed_signal_norms_idx = [idx for idx in pruned_signal_norms_idx if idx!=pruned_poi_index]
# set the rest of signal normalisation parameters fixed at zero
pruned_fix_pars = pruned_model.config.suggested_fixed().copy()
pruned_init_pars = pruned_model.config.suggested_init().copy()
for idx in pruned_fixed_signal_norms_idx:
pruned_fix_pars[idx] = True
pruned_init_pars[idx] = 0.0
# calculate expected CLs with respect to POI
try:
_, CLs_exp_band = pyhf.infer.hypotest(1.0, asimov_dataset, pruned_model, init_pars=pruned_init_pars, fixed_params=pruned_fix_pars, return_expected_set=True)
# keep only ±1 sigma and convert to simple floats
CLs_exp_band = list(map(lambda x: float(x), CLs_exp_band[1:4]))
except:
CLs_exp_band = None
# Observed CLs
print(f"Calculating observed CLs for {param}")
# get the index of the param, this will be the new POI
poi_index = cabinetry.model_utils._poi_index(model, poi_name=param)
# set POI to param
model.config.set_poi(model.config.par_names[poi_index])
# indices of all the signal normalisation parameters except the poi_idx
fixed_signal_norms_idx = [idx for idx in signal_norms_idx if idx!=poi_index]
# set the rest of signal normalisation parameters fixed at zero
fix_pars = model.config.suggested_fixed().copy()
init_pars = model.config.suggested_init().copy()
for idx in fixed_signal_norms_idx:
fix_pars[idx] = True
init_pars[idx] = 0.0
CLs_obs_band = []
# run three fits with different fixed signal theory NPs
for signal_theory_np in [-1.0, 0.0, 1.0]:
# fix signal theory NPs
for idx in signal_theory_idx:
fix_pars[idx] = True
init_pars[idx] = signal_theory_np
# calculate observed CLs with respect to POI
try:
CLs_obs = pyhf.infer.hypotest(1.0, data, model, init_pars=init_pars, fixed_params=fix_pars)
# convert to simple floats
CLs_obs = float(CLs_obs)
except:
CLs_obs = None
# save results
CLs_obs_band.append(CLs_obs)
return [CLs_obs_band, CLs_exp_band]
# results dictionary
post_fit_significance_results = {}
# loop over the signal parameters of the workspace and calculate
for param in signal_norms:
result = fix_and_calculate_CLs[param]
post_fit_significance_results[param] = result
- Expected Events
References:
Is there a reason that these references are chosen? Is there some HEP reference paper on stats that cites the J. Phys. Chem. A 2001 paper? If not, let's check the PDG to see if this is included there, unless there was something particularly striking about this reference.
- Discovery
$p$ -valuesFrom Equation 52 in arXiv:1503.07622:
Maybe I need to read this section more carefully, but how does Equation 52 show the following statements? Regardless of how it does, can we make this more explicit here?
$p_{\mu}$ (and$\mathrm{CL}_b$ )
$\mathrm{CL}_\mathrm{b}$ provides a measure of compatibility of the observed data with the 95% CL signal strength hypothesis relative to fluctuations of the backgroundThis is mixing multiple ideas of an upper limit signal strength and the definition of
$\mathrm{CL}_b$ .$\mathrm{CL}_b$ is just a statement of the$p$ -value of the background only hypothesis given the observed test-statistic, and then flipped with the$1-p_b$ part for geometric reasons for use in the ratio that gives$\mathrm{CL}_s$ — it is a construction that exists only by necessity for the$\mathrm{CL}_s$ method. The$\mathrm{CL}_b$ makes no statement on what is the observed test statistic that you use. Here we are choosing the test stat for the 95% CL upper limit but that is just a choice as we wanted a test size of$\alpha=0.05$ , but it could have been anything. So I we should not include the test size in the definition (3 line equality) of$\mathrm{CL}_b$ .See the paragraph above plot on page two in this document.
Is there another reference that we can use? I know that this one comes up in search results because it is the only bloody thing that isn't firewalled by ATLAS, but I've never been a huge fan of this note and it kinda bugs me that it doesn't have any stable reference. It isn't even on CDS (private to ATLAS) as far as I know, but I would love to be wrong on this!
$p_0$
$p(s)=0$ measures compatibility of the observed data with the background-only (zero signal strength) hypothesis relative to fluctuations of the background.
$p$ -values do not measure compatibility of a model and data. I know that the caption for Table 20 literally says this, but it is wrong.$p$ -values are the probability given a model hypothesis to have observed a test statistic as extreme or more so (extreme in terms of disagreement with the model hypothesis) than the observed test statistic. They can not make statements on model compatibility with the data. So I would suggest we rephrase this section.Making the table
- It would probably be useful to mention that
DRInt_cuts
should correspond to Signal Region "DR-Int-EWK" and comment on why the results we get here are slightly different form the published table. This is maybe beating the point over the head, but I think it would help (would help 02:00 Matthew!).
Originally posted by @matthewfeickert in #57 (review)
In PR #57 there was some disagreement on the accuracy of different statistical descriptions that had been provided as template statements by ATLAS. This is not an ATLAS document so the descriptions of the statistical procedure should be arrived at independently .
Hi,
In the following sentence:
Nicely enough, pyhf is smart enough to let us know that, if for a model where
, we’re not going to be able to catch it because our default bounds were bounding the parameter of interest from
below at 0. For more information, see Equation 14 in [[10.7.2727]](https://arxiv.org/abs/10.7.2727).
The link to the arxiv paper does not work.
Could you please fix it?
Thanks
Aman
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.