xenon1t / hax
Handy Analysis for XENON (reduce processed data)
Right now hax does a lot of stuff on import:
It would be better to have a hax.init(config_file, **options) that you always call before using hax. This way you can ensure the right pax class gets loaded and the right data dir gets used. Currently you have to modify internal hax data structures, then call some internal functions, etc.
Moreover, if the default pax event class isn't the one you want to load, you need to load the right one yourself, and you get a whole bunch of warnings about loading everything twice.
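A minimal sketch of the proposed pattern (illustrative only, not hax's actual internals): nothing heavy happens at import time, and all configuration is deferred to an explicit init() call.

```python
# Sketch of an explicit-init module: importing it does no work; users
# must call init() once before anything else. Names and defaults here
# are illustrative assumptions, not hax's real configuration keys.

_config = None  # module-level state, empty until init() is called


def init(config_file=None, **options):
    """Load configuration once, before any other function is used."""
    global _config
    _config = {'main_data_paths': ['.'], 'pax_version_policy': 'latest'}
    # a config file would be parsed here; keyword options override it
    _config.update(options)
    return _config


def get_config():
    """Fail loudly if the module is used without init()."""
    if _config is None:
        raise RuntimeError("call init() before using this module")
    return _config
```

With this pattern there is a single, obvious place to ensure the right pax class and data directory are selected, instead of poking at internal data structures.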
It would be nice if we could query multiple runs at a time with get_run_info, to write something like
lifetimes = hax.runs.get_run_info(my_list_of_run_numbers, 'processor.DEFAULT.electron_lifetime_liquid')
The tag version printing logic introduced in hax.runs.tags_selection (https://github.com/XENON1T/hax/blob/master/hax/runs.py#L342) assumes there is always an include tag. If you try e.g.
hax.runs.tags_selection(exclude=['bad', 'worse', 'terrible'])
you now get an error (TypeError: 'NoneType' object is not iterable).
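The defensive fix is small: treat a missing include list as empty instead of iterating over None. A minimal standalone sketch (the function name is illustrative, not hax's actual code):

```python
# Sketch of the fix: normalize None to an empty list before iterating,
# so an exclude-only selection no longer raises
# TypeError: 'NoneType' object is not iterable.

def describe_selection(include=None, exclude=None):
    include = include or []
    exclude = exclude or []
    return "include: %s, exclude: %s" % (
        ', '.join(include) or '(none)',
        ', '.join(exclude) or '(none)')
```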
Since LargestPeakProperties is missing 'peaks.reconstructed_positions*' in its extra_branches list, its code for finding peak positions (https://github.com/XENON1T/hax/blob/master/hax/treemakers/common.py#L244) never succeeds. Consequently the branches s2_x etc. created by this treemaker are always nan.
Incidentally, this minitree has some branches that clash with Basics (e.g. s2_area_fraction_top), but they do not have the same meaning (in Basics the s2 refers to the main S2, in LargestPeakProperties the largest S2).
When you load minitrees with multiple processes using the num_workers option, and the load crashes or is interrupted, some processes can apparently remain alive even after restarting your notebook kernel. This might be one of the reasons many people have a lot of processes open on the jupyterhub.
I'm not sure what we can do about this, except perhaps investigate and try to isolate the issue. If we have a clear example we can report it upstream to dask or jupyterhub (wherever the problem seems to lie).
Hi all,
I have installed pax with the following packages and it works perfectly, but hax now throws a segmentation violation. Do I need to install a special PyROOT or root-numpy version?
wget https://repo.continuum.io/archive/Anaconda3-4.3.0-Linux-x86_64.sh
bash Anaconda3-4.3.0-Linux-x86_64.sh
export PATH=/home/l-althueser/anaconda3/bin:$PATH
conda config --add channels defaults
conda config --add channels http://conda.anaconda.org/NLeSC
conda create -q -n pax python=3.4 root=6.04 toolz numpy scipy matplotlib pandas cython h5py numba pip python-snappy pytables scikit-learn rootpy psutil jupyter root_pandas
source activate pax
pip install coveralls nose coverage
pip install mongodbproxy
git clone https://github.com/XENON1T/pax.git
cd pax
python setup.py develop
paxer --version
conda install root_numpy
git clone https://github.com/XENON1T/hax.git
cd hax
python setup.py develop
haxer --version
$ haxer --version
ERROR:ROOT.TUnixSystem.DispatchSignals] segmentation violation
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/bin/haxer", line 6, in <module>
ERROR:stack] exec(compile(open(__file__).read(), __file__, 'exec'))
ERROR:stack] File "/home/l-althueser/hax/bin/haxer", line 6, in <module>
ERROR:stack] import hax
ERROR:stack] File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
ERROR:stack] File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
ERROR:stack] File "<frozen importlib._bootstrap>", line 1200, in _load_unlocked
ERROR:stack] File "<frozen importlib._bootstrap>", line 1129, in _exec
ERROR:stack] File "<frozen importlib._bootstrap>", line 1471, in exec_module
ERROR:stack] File "<frozen importlib._bootstrap>", line 321, in _call_with_frames_removed
ERROR:stack] File "/home/l-althueser/hax/hax/__init__.py", line 13, in <module>
ERROR:stack] from . import misc, minitrees, paxroot, pmt_plot, raw_data, runs, utils, treemakers, data_extractor, \
ERROR:stack] File "<frozen importlib._bootstrap>", line 2284, in _handle_fromlist
ERROR:stack] File "<frozen importlib._bootstrap>", line 321, in _call_with_frames_removed
ERROR:stack] File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
ERROR:stack] File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
ERROR:stack] File "<frozen importlib._bootstrap>", line 1200, in _load_unlocked
ERROR:stack] File "<frozen importlib._bootstrap>", line 1129, in _exec
ERROR:stack] File "<frozen importlib._bootstrap>", line 1471, in exec_module
ERROR:stack] File "<frozen importlib._bootstrap>", line 321, in _call_with_frames_removed
ERROR:stack] File "/home/l-althueser/hax/hax/minitrees.py", line 15, in <module>
ERROR:stack] from .paxroot import loop_over_dataset, function_results_datasets
ERROR:stack] File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
ERROR:stack] File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
ERROR:stack] File "<frozen importlib._bootstrap>", line 1200, in _load_unlocked
ERROR:stack] File "<frozen importlib._bootstrap>", line 1129, in _exec
ERROR:stack] File "<frozen importlib._bootstrap>", line 1471, in exec_module
ERROR:stack] File "<frozen importlib._bootstrap>", line 321, in _call_with_frames_removed
ERROR:stack] File "/home/l-althueser/hax/hax/paxroot.py", line 14, in <module>
ERROR:stack] from pax.plugins.io.ROOTClass import load_event_class, load_pax_event_class_from_root, ShutUpROOT
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/ROOT.py", line 301, in _importhook
ERROR:stack] return _orig_ihook( name, *args, **kwds )
ERROR:stack] File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
ERROR:stack] File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
ERROR:stack] File "<frozen importlib._bootstrap>", line 1200, in _load_unlocked
ERROR:stack] File "<frozen importlib._bootstrap>", line 1129, in _exec
ERROR:stack] File "<frozen importlib._bootstrap>", line 1471, in exec_module
ERROR:stack] File "<frozen importlib._bootstrap>", line 321, in _call_with_frames_removed
ERROR:stack] File "/home/l-althueser/pax/pax/plugins/io/ROOTClass.py", line 17, in <module>
ERROR:stack] from pax import plugin, datastructure, exceptions
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/ROOT.py", line 301, in _importhook
ERROR:stack] return _orig_ihook( name, *args, **kwds )
ERROR:stack] File "<frozen importlib._bootstrap>", line 2284, in _handle_fromlist
ERROR:stack] File "<frozen importlib._bootstrap>", line 321, in _call_with_frames_removed
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/ROOT.py", line 301, in _importhook
ERROR:stack] return _orig_ihook( name, *args, **kwds )
ERROR:stack] File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
ERROR:stack] File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
ERROR:stack] File "<frozen importlib._bootstrap>", line 1200, in _load_unlocked
ERROR:stack] File "<frozen importlib._bootstrap>", line 1129, in _exec
ERROR:stack] File "<frozen importlib._bootstrap>", line 1471, in exec_module
ERROR:stack] File "<frozen importlib._bootstrap>", line 321, in _call_with_frames_removed
ERROR:stack] File "/home/l-althueser/pax/pax/plugin.py", line 16, in <module>
ERROR:stack] from pax import dsputils
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/ROOT.py", line 301, in _importhook
ERROR:stack] return _orig_ihook( name, *args, **kwds )
ERROR:stack] File "<frozen importlib._bootstrap>", line 2284, in _handle_fromlist
ERROR:stack] File "<frozen importlib._bootstrap>", line 321, in _call_with_frames_removed
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/ROOT.py", line 301, in _importhook
ERROR:stack] return _orig_ihook( name, *args, **kwds )
ERROR:stack] File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
ERROR:stack] File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
ERROR:stack] File "<frozen importlib._bootstrap>", line 1200, in _load_unlocked
ERROR:stack] File "<frozen importlib._bootstrap>", line 1129, in _exec
ERROR:stack] File "<frozen importlib._bootstrap>", line 1471, in exec_module
ERROR:stack] File "<frozen importlib._bootstrap>", line 321, in _call_with_frames_removed
ERROR:stack] File "/home/l-althueser/pax/pax/dsputils.py", line 9, in <module>
ERROR:stack] nopython=True)
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/decorators.py", line 176, in wrapper
ERROR:stack] disp.compile(sig)
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/dispatcher.py", line 532, in compile
ERROR:stack] cres = self._compiler.compile(args, return_type)
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/dispatcher.py", line 81, in compile
ERROR:stack] flags=flags, locals=self.locals)
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/compiler.py", line 693, in compile_extra
ERROR:stack] return pipeline.compile_extra(func)
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/compiler.py", line 350, in compile_extra
ERROR:stack] return self._compile_bytecode()
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/compiler.py", line 658, in _compile_bytecode
ERROR:stack] return self._compile_core()
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/compiler.py", line 645, in _compile_core
ERROR:stack] res = pm.run(self.status)
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/compiler.py", line 228, in run
ERROR:stack] stage()
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/compiler.py", line 583, in stage_nopython_backend
ERROR:stack] self._backend(lowerfn, objectmode=False)
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/compiler.py", line 538, in _backend
ERROR:stack] lowered = lowerfn()
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/compiler.py", line 525, in backend_nopython_mode
ERROR:stack] self.flags)
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/compiler.py", line 811, in native_lowering_stage
ERROR:stack] lower.lower()
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/lowering.py", line 141, in lower
ERROR:stack] self.library.add_ir_module(self.module)
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/targets/codegen.py", line 158, in add_ir_module
ERROR:stack] self.add_llvm_module(ll_module)
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/targets/codegen.py", line 170, in add_llvm_module
ERROR:stack] self._optimize_functions(ll_module)
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/numba/targets/codegen.py", line 88, in _optimize_functions
ERROR:stack] fpm.run(func)
ERROR:stack] File "/home/l-althueser/anaconda3/envs/pax_p34/lib/python3.4/site-packages/llvmlite/binding/passmanagers.py", line 127, in run
ERROR:stack] return ffi.lib.LLVMPY_RunFunctionPassManager(self, function)
Seeing this error running within cax in the pax_v6.2.0 environment (hax v1.2.0):
Traceback (most recent call last):
File "/project/lgrandi/anaconda3/envs/pax_v6.2.0/lib/python3.4/site-packages/cax-4.10.5-py3.4.egg/cax/main.py", line 131, in main
task.go(args.run)
File "/project/lgrandi/anaconda3/envs/pax_v6.2.0/lib/python3.4/site-packages/cax-4.10.5-py3.4.egg/cax/task.py", line 65, in go
self.each_run()
File "/project/lgrandi/anaconda3/envs/pax_v6.2.0/lib/python3.4/site-packages/cax-4.10.5-py3.4.egg/cax/tasks/process_hax.py", line 101, in each_run
self.run_doc['detector'])
File "/project/lgrandi/anaconda3/envs/pax_v6.2.0/lib/python3.4/site-packages/cax-4.10.5-py3.4.egg/cax/tasks/process_hax.py", line 53, in _process_hax
init_hax(in_location, pax_version, out_location) # may initialize once only
File "/project/lgrandi/anaconda3/envs/pax_v6.2.0/lib/python3.4/site-packages/cax-4.10.5-py3.4.egg/cax/tasks/process_hax.py", line 25, in init_hax
minitree_paths = [out_location])
File "/project/lgrandi/anaconda3/envs/pax_v6.2.0/lib/python3.4/site-packages/hax-1.2.0-py3.4.egg/hax/__init__.py", line 66, in init
update_treemakers()
File "/project/lgrandi/anaconda3/envs/pax_v6.2.0/lib/python3.4/site-packages/hax-1.2.0-py3.4.egg/hax/minitrees.py", line 123, in update_treemakers
__import__('hax.treemakers.%s' % module_name, globals=globals())
File "/project/lgrandi/anaconda3/envs/pax_v6.2.0/lib/ROOT.py", line 301, in _importhook
return _orig_ihook( name, *args, **kwds )
File "/project/lgrandi/anaconda3/envs/pax_v6.2.0/lib/python3.4/site-packages/hax-1.2.0-py3.4.egg/hax/treemakers/trigger.py", line 6, in <module>
from hax.trigger_data import get_aqm_pulses
ImportError: cannot import name 'get_aqm_pulses'
As it says here: https://github.com/XENON1T/hax/blob/master/hax/pmt_plot.py#L8, if you use plot_on_pmts with the physical geometry, the color scale shown in the color bar applies to only one of the plots. The other plot has its own color scale, which can be different. The current workaround is to specify vmin and vmax manually, but this should be fixed.
Moreover, the API for pmt_plot leaves a lot to be desired (e.g. have to specify color and size, can't specify scalar for one and array for the other, etc.).
I found this in pmt_plot.py:
## Known issue's I'm to lazy to fix right now:
## - on physical layout, color and/or size probably not on same scale in two subplots
## unless vmin&vmax are specified explicitly.
## - Have categorical labels event if _channel present. Make digitizer obey _channel suffix convention.
but don't recall if I since fixed these :-) Worth checking at some point.
from hax import slow_control
slow_control.init_sc_interface()
Gives:
KeyError: 'sc_variable_list'
Here's a brand new bug report :)
https://gist.github.com/ErikHogenbirk/5da9801d81b253ef7717
Seems it asks for hax CONFIG, but is that still in the latest hax version?
Currently only some very basic info is queried from the db -- name, number and source. We should get several more fields for the analyst to use, such as timestamp, tags, duration, ... . There should also be a utility function to get all the info from a specific run (the complete json).
We want to start migrating corrections from pax to hax wherever possible. Since it seems like pax is skipping the x-y correction anyway, we can start with that one as a test case.
This update should set a basic framework for applying up-to-date corrections similar to how it is done in cax. This should be relatively easy for all the multiplicative area corrections.
The 'data' from hax now has duplicate columns named 'index'. When trying to convert the pandas dataframe to numpy this gives an error, which can be worked around with:
# Convert dataframe to numpy array, so we don't need .values all the time
data = data.T.groupby(level=0).first().T
data = data.to_records(index=False)
Probably every hax.minitrees.TreeMaker one uses adds an 'index' column. There should only ever be one 'index' column in 'data'.
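The problem and the workaround can be reproduced on a toy frame (column names below are invented; only the duplicated 'index' name matters):

```python
import pandas as pd

# Build a frame, then simulate the clash by renaming two columns to
# 'index'. to_records() refuses duplicated field names, so we keep only
# the first column per name via a transpose + groupby round-trip.
df = pd.DataFrame([[0, 1.5, 0], [1, 2.5, 1]],
                  columns=['index', 's1_area', 'index2'])
df.columns = ['index', 's1_area', 'index']   # simulate the name clash

deduped = df.T.groupby(level=0).first().T    # one column per name
records = deduped.to_records(index=False)    # now succeeds
```

A cleaner long-term fix would be for the minitree loader to drop each TreeMaker's 'index' column before concatenating.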
Pax patch releases don't always introduce new processing functionality. Hence, you would like to be allowed to mix patch versions -- 6.1.0 and 6.1.1 for example -- by saying pax_version_policy='6.1'.
However, currently you must either choose an exact pax version with e.g. pax_version_policy='6.1.0' (excluding 6.1.1 datasets) or take the latest available with the default pax_version_policy='latest' (including pax 5.x.x datasets for some runs).
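The matching rule itself is simple: compare version components, so a policy is a prefix of the dataset's version. A standalone sketch (function name is illustrative):

```python
# Sketch of prefix-style version matching: '6.1' accepts any 6.1.x,
# while a full version like '6.1.0' still matches exactly. Comparing
# split components (not raw string prefixes) avoids '6.10' matching
# a '6.1' policy.

def version_matches(version, policy):
    """True if `version` satisfies a prefix policy like '6.1'."""
    v_parts = version.split('.')
    p_parts = policy.split('.')
    return v_parts[:len(p_parts)] == p_parts
```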
When making minitrees of pax v6.1.0 data on midway with LargestPeakProperties the *_x and *_y fields are missing.
To reproduce:
import hax
hax.init(pax_version_policy='6.1.0')
data = hax.minitrees.load(2047, treemakers=['LargestPeakProperties'], force_reload=True, num_workers=1)
data.keys()
The lone_hit_* branches are missing in some LargestPeakProperties minitrees, causing problems in the MC workflow. Can reproduce with files on Midway in:
/project/lgrandi/pdeperio/161206-hax_debug
by running:
source activate pax_v6.1.1
HAXPYTHON="import hax; "
HAXPYTHON+="hax.init(main_data_paths=['/project/lgrandi/pdeperio/161206-hax_debug'], minitree_paths=['.'], pax_version_policy = 'loose'); "
HAXPYTHON+="hax.minitrees.load('Xenon1T_TPC_Kr83m_00000_g4mc_NEST_Patch_pax', ['LargestPeakProperties']);"; python -c "${HAXPYTHON}" # Seems OK
HAXPYTHON+="hax.minitrees.load('Xenon1T_TPC_Kr83m_00183_g4mc_NEST_Patch_pax', ['LargestPeakProperties']);"; python -c "${HAXPYTHON}" # Missing branches
producing the two files:
Xenon1T_TPC_Kr83m_00000_g4mc_NEST_Patch_pax_LargestPeakProperties.root # Seems OK
Xenon1T_TPC_Kr83m_00183_g4mc_NEST_Patch_pax_LargestPeakProperties.root # Missing branches
@tunnell and @coderdj would like this to avoid the dependency on hax in lax due to XENON1T/lax#43. As with any other change to a minitree, we should first update all minitrees using hax in a branch, then copy them over and only then merge.
Most jobs are hanging after hax completes with error:
slurmstepd-midway2-0091: error: Exceeded step memory limit at some point.
for example in this log:
/project2/lgrandi/xenon1t/cax/5803_v6.4.2/5803_v6.4.2_24432875.log
causing them to occupy the batch queue for much longer than necessary.
The minitrees were created successfully, so seems like an issue with clearing memory. Will also contact RCC for suggestions.
Right now the minitrees get created in the folder where hax is called. To keep a nice and clean workspace it would be better if one could specify where hax creates the minitrees. The standard option would still be this folder, but the user could specify one folder that holds all the minitrees they use.
Hax would also need to find the minitrees in the user-specified folder.
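The lookup side is a search-path walk; other reports here suggest newer hax versions expose a minitree_paths option in hax.init for exactly this. A standalone sketch of the idea (not hax's actual implementation):

```python
import os

# Toy minitree search path: create files in the first configured
# directory, but look in every configured directory when loading.

def find_minitree(filename, search_paths):
    """Return the first existing copy of `filename`, or None."""
    for path in search_paths:
        candidate = os.path.join(path, filename)
        if os.path.exists(candidate):
            return candidate
    return None
```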
There is an option for hax.init() to not use the runs database. However, this means that update_datasets does not do anything, so datasets will be None. Perhaps I am missing something here, but it seems you can specify not to use the runs database, yet if you then attempt to load anything, hax gives you an error since there is no variable listing the datasets.
For now, I made a workaround that builds this variable from the files in the main_data_paths, using the file name as the run name. I can implement this and create a pull request, but maybe I am missing some secret configuration... Any thoughts?
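The described workaround can be sketched standalone: scan the configured data paths for processed files and use the file names as run names (function and field names are illustrative, not hax's internals):

```python
import os

# Build a minimal dataset list from .root files found in the data
# paths, for use when the runs database is disabled.

def datasets_from_paths(main_data_paths):
    datasets = []
    for path in main_data_paths:
        for fn in sorted(os.listdir(path)):
            if fn.endswith('.root'):
                datasets.append({'name': fn[:-len('.root')],
                                 'location': os.path.join(path, fn)})
    return datasets
```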
I'm getting an error when loading some minitrees...
https://gist.github.com/ErikHogenbirk/38f8c047932023b0520f521b834fe42c
Hax is complaining about RuntimeError: failed to load the library for 'std::vector<Hit>' @ 9ed875416084b362. I have seen this error with simulated data, but also in real data. Anyone have any idea what might be the problem?
mc v 0.1.7, pax 6.2.1, hax 1.3.0
We must first make the minitrees for the first dataset given to hax.minitrees.load, since the dask multiprocessing only works if it knows which variables to expect. Since both steps get the force_reload option, using a single dataset with minitrees.load and force_reload causes a double remake.
Not a big issue, but it hints at a bigger problem: having to make the minitrees to know which variables they will produce. This is inconvenient for other reasons too (see e.g. #47).
Currently TotalProperties computes the total peak area of an event by looping over all peaks: https://github.com/XENON1T/hax/blob/master/hax/treemakers/common.py#L176
This includes "peaks" from acquisition monitor channels, in particular the analog summed bottom array, which has peaks with a non-negligible area. This is bad, because it makes a quantity like (s1+s2)/total_area dependent on whether the analog sum waveform triggered or not, and causes nonlinear behaviour near the threshold.
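The fix amounts to filtering on the peak's detector before summing. A standalone sketch with toy peak dicts standing in for pax's peak objects (the 'sum_wv' label and areas are invented):

```python
# Only count TPC peaks toward the event's total area, so acquisition
# monitor "peaks" (e.g. the analog summed bottom array) no longer bias
# quantities like (s1+s2)/total_area.

def total_tpc_area(peaks):
    return sum(p['area'] for p in peaks if p['detector'] == 'tpc')


peaks = [
    {'detector': 'tpc', 'area': 100.0},      # S1-like peak
    {'detector': 'tpc', 'area': 2500.0},     # S2-like peak
    {'detector': 'sum_wv', 'area': 300.0},   # analog sum waveform "peak"
]
```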
Implement bumpversion to keep track of versions.
Following error in a few runs causing jobs to hang:
/project/lgrandi/anaconda3/envs/pax_v6.4.2/lib/python3.4/site-packages/pandas/computation/align.py:98: RuntimeWarning: divide by zero encountered in log10
ordm = np.log10(abs(reindexer_size - term_axis_size))
in e.g. run 5414:
/project2/lgrandi/xenon1t/cax/5414_v6.4.2/5414_v6.4.2_24265319.log
When we want to do a single scatter cut, we mostly cut on the S2 size in some way. However, based on explorations in this note by Tianyu:
https://xecluster.lngs.infn.it/dokuwiki/lib/exe/fetch.php?media=xenon:xenon1t:sim:notes:jhowlett:main_singlescatter_simplified_copy_feb_20_2017.html
it seems that a lot of the second-largest S2s could just be single-electron pile-up, and that they can be identified by their width. I could imagine a single scatter cut where we cut not only on the second S2 size, but also on its width. For this it would be great if a property largest_other_s2_width could be added. It's not critical, it's just a nuisance if we want to use this parameter and have to add it by hand all the time. Let me know what you think. If you disagree I'll drop the issue.
The trigger produces a trigger data file (which will change format soon, see XENON1T/pax#343) with useful information such as the dark rate in each PMT over time. It would be useful to have some common access tools for this file. For example we may want to have a cut/TreeMaker which tells you if the PMTs contributing to the main S1 have an unusually high dark rate.
This removes lax dependency on hax. @coderdj
https://gist.github.com/ErikHogenbirk/663a3511a272c12098ce
Just pulled the latest version this afternoon, using examples/07_check_data.ipynb, only first cell.
Relevant code:
import hax # (works fine)
hax.init()
Get this error:
KeyError: 'rz_position_distortion_map'
using hax v1.4.3 with pax_v6.2.0. I guess it's because that map wasn't implemented back then.
Can we try to specify version requirements for all our packages (hax, lax, cax) more stringently?
The XENON100 examples we used at the analysis workshops are out of date by now; it would be nice to distill a few new examples from some of the notes that people have published.
When you use some of the hax.raw_data functions, you often have to specify a lot of custom config options to avoid pax making a root file or erroring on event proxies. E.g.:
config = dict(pax=dict(pre_output=[], encoder_plugin=None, output='Dummy.DummyOutput'))
for event in hax.raw_data.process_events(some_run_name, config_override=config):
...
For the Muon veto this is even worse, currently the XENON1T (i.e. TPC) config is used automatically even if hax.init(detector='muon_veto'). You can work around this with e.g.
config = dict(pax=dict(pre_output=[], dsp=[], encoder_plugin=None, output='Dummy.DummyOutput'))
from copy import deepcopy
config = deepcopy(hax.utils.combine_pax_configs(mv_config, config))
for event in hax.raw_data.process_events(some_run_name, config_override=config):
...
but even this gives errors if you did not blank the dsp group. Also the use of deepcopy is tricky: if you don't, pax will modify the config as it loads it (to insert the encoder plugin group) and you eventually end up in a mess.
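The deepcopy point can be demonstrated standalone. The sketch below mimics combining two nested configs (the function name is illustrative; hax's real helper is hax.utils.combine_pax_configs): without the deep copy, a later in-place edit of the combined config, like pax inserting its encoder plugin group, would leak back into your original override.

```python
from copy import deepcopy

# Merge two {section: {key: value}} configs, override winning, without
# sharing any mutable state with either input.

def combine_configs(base, override):
    result = deepcopy(base)
    for section, values in override.items():
        result.setdefault(section, {}).update(deepcopy(values))
    return result
```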
It would be nice if there was a quick way to get hitfinder diagnostic plots (i.e. the "PMT waveforms", but also with the hitfinder's interpretation) directly in hax.
This tool is a nice one for debugging, but it would be nice if the return object of get_aqm_pulses could include an event number, for example a tuple like: return {k: (event_number, np.array(v)) for k, v in aqm_signals.items()}.
It would be nice to include a bit of code to extract all peaks, hits, etc. from a file, with an extra column which tells you which event they came from. This doesn't have to be very fast, as it is not a very common analysis task, though ideally it should not fry your RAM (so maybe allow for some chunking).
The peaks in particular are useful to plot in exploratory analyses (see e.g. the graphs in Erik's GXe note, even though those were only produced for the largest peaks since that's what hax currently allows you to do conveniently). The hits and pulses are useful for gain calibrations and hitfinder efficiency studies.
You can write your own code to do this of course, e.g. with hax.paxroot.looproot; I and others have some code like that -- a nice version of this would be a good addition to hax.
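The core of such an extractor can be sketched standalone: walk events, flatten every peak into a row tagged with its event number, and flush in chunks to keep memory bounded. Events and peaks are plain dicts here; a real version would loop with hax.paxroot over pax event objects.

```python
import pandas as pd

# Flatten all peaks in a sequence of events into one dataframe, with an
# extra event_number column telling you where each peak came from.

def extract_peaks(events, chunk_size=1000):
    chunks, rows = [], []
    for event in events:
        for peak in event['peaks']:
            rows.append(dict(peak, event_number=event['event_number']))
        if len(rows) >= chunk_size:          # flush to bound memory use
            chunks.append(pd.DataFrame(rows))
            rows = []
    if rows:
        chunks.append(pd.DataFrame(rows))
    return pd.concat(chunks, ignore_index=True)
```

The same pattern would work for hits and pulses, which is what the gain calibration and hitfinder efficiency studies need.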
as suggested by @JelleAalbers in XENON1T/cax#116 (comment)
I've used the cache_file option in minitrees.load, and I was thinking that this is maybe something you'd also want for a normal minitree. For instance, suppose you apply a bunch of cuts and keep only a small number of events (like the NR single scatter band), and you want to share this with a colleague (or some other script if you don't have any friends). It'd be nice to be able to load and store minitrees in a standard haxy way.
I think it would be quite easy to make, if people agree that this could be useful I'll make the code and make a pull request. If not then I'll shut up about it and just use pickle for my own code :)
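At its simplest this is a thin wrapper around pandas' own round-trip serialization. A sketch of how such a "haxy" save/load could look (function names are invented):

```python
import pandas as pd

# Save and reload a reduced, post-cuts dataframe via pandas' standard
# pickle round-trip; a real hax version could also record metadata like
# the cut history and pax version.

def save_reduced(df, path):
    df.to_pickle(path)


def load_reduced(path):
    return pd.read_pickle(path)
```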
Seems to be a benign error running on MC (with pax_version_policy = 'loose'):
Could not find run number for Xenon1T_TPC_Kr83m_96.000000_g4mc_NEST_Patch_pax, got exception <class 'IndexError'>: index 0 is out of bounds for axis 0 with size 0. Setting run number to 0.
Would be nice to suppress/fix this to clean up logs and facilitate (real) error searching.
Not much more to say. This also applies to inspect_events_from_dataframe.
When calling hax.raw_data.inspect_events(198, 0) I get an error about the name of the trigger data file:
ValueError: Invalid file name: /xetemp/xenon1t/160425_1103/trigger_monitor_data.zip.
Should be tpcname-something-firstevent-lastevent-nevents.zip
I guess this is due to the recent addition of trigger zipped bsons?
pax 4.9.1, hax 0.2, running on xecluster06, trying to read from /xetemp/xenon1t/
I'm not really sure if this is a pax or hax issue, but I'm unable to plot from my notebook. Will try plotting directly from pax.
Input: https://gist.github.com/ErikHogenbirk/25d4f8808ce5fb70b51fcadca99f4a2a
Output: https://gist.github.com/ErikHogenbirk/33b0e735ac0677a6abdeb39912c1ca3d
It seems the data is found and read; there is a checkpulses warning at the beginning. Then it throws some ROOT error and everything goes wrong.
pax 4.8.0
hax 0.1
I tried installing hax on my xecluster account, but I got two errors:
with
git clone http://github.com/XENON1T/hax.git
I get
Initialized empty Git repository in /home/wittweg/hax/.git/
error: Failed connect to github.com:443; Operation now in progress while accessing https://github.com/XENON1T/hax.git/info/refs
fatal: HTTP request failed
So I tried a workaround by just doing scp from my local pc. If I run
python setup.py develop
the installation fails with:
running develop
error: can't create or remove files in install directory
The following error occurred while trying to add or remove files in the
installation directory:
[Errno 13] Permission denied: '/archive_lngs/common/anaconda3/envs/pax440r5/lib/python3.4/site-packages/test-easy-install-16436.write-test'
The installation directory you specified (via --install-dir, --prefix, or
the distutils default setting) was:
/archive_lngs/common/anaconda3/envs/pax440r5/lib/python3.4/site-packages/
Perhaps your account does not have write access to this directory? If the
installation directory is a system-owned directory, you may need to sign in
as the administrator or "root" account. If you do not have administrative
access to this machine, you may wish to choose a different installation
directory, preferably one that is listed in your PYTHONPATH environment
variable.
For information on other options, you may wish to consult the
documentation at:
https://pythonhosted.org/setuptools/easy_install.html
Please make the appropriate changes for your system and try again.
In hax/hax/slow_control.py, around line 176:
In the for loop, the variable entry is often not a dict. The code expects a tuple-like record in which the key 'timestampseconds' (datetime.utcfromtimestamp(entry['timestampseconds'])) is present, but very often this is not true. I don't understand the reason (some errors in the slow control database entries?). I worked around the problem by inserting a check on the entry type in the for loop:
if not isinstance(entry, dict):
    continue
This fixes the problem, but it's not the only one.
Several times when I try to process a dataset with pax_v6.5.0, cax uses this hax function to connect to the slow control database, and only after many attempts (40-50) is it able to connect and read the voltage values of each PMT in the AddGains function. After a few tries cax starts to process the run without error messages.
For people who want to run hax locally on their laptop without using the rundb info, it would be nice to have an init option for hax to work without the db and thus without setting a mongodb password. (This works with experiment = 'XENON100', but not anymore with experiment = 'XENON1T'.)
Even though hax was created with analysis facilities in mind, a small option that lets you work locally is still a very nice feature that I and others use.
When looking for the file 'xenon.root', hax successfully finds the file when the notebook is run in directory /my/dir/, but not when run in /my/dir/subdir/. There are no event class files or minitrees in either folder. Here is the traceback of the error.
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-4-5d76aadf63af> in <module>()
1 # dataset = 'radon_cut_xe100_110423_1252'
----> 2 data = hax.minitrees.load('xenon', treemakers=['Basics'], force_reload= True);
/project/lgrandi/anaconda3/envs/pax_head/lib/python3.4/site-packages/hax-0.2-py3.4.egg/hax/minitrees.py in load(datasets, treemakers, force_reload)
151 dataframes = []
152 for dataset in datasets:
--> 153 minitree_path = get(dataset, treemaker, force_reload=force_reload)
154 new_df = pd.DataFrame.from_records(root_numpy.root2rec(minitree_path))
155 dataframes.append(new_df)
/project/lgrandi/anaconda3/envs/pax_head/lib/python3.4/site-packages/hax-0.2-py3.4.egg/hax/minitrees.py in get(dataset, treemaker, force_reload)
115 # We have to make the minitree file
116 # This will raise FileNotFoundError if the root file is not found
--> 117 skimmed_data = treemaker().get_data(dataset)
118 print("Created minitree %s for dataset %s" % (treemaker.__name__, dataset))
119
/project/lgrandi/anaconda3/envs/pax_head/lib/python3.4/site-packages/hax-0.2-py3.4.egg/hax/minitrees.py in get_data(self, dataset)
44 """Return data extracted from running over dataset"""
45 loop_over_dataset(dataset, self.process_event,
---> 46 branch_selection=hax.config['basic_branches'] + list(self.extra_branches))
47 self.check_cache(force_empty=True)
48 if not hasattr(self, 'data'):
/project/lgrandi/anaconda3/envs/pax_head/lib/python3.4/site-packages/hax-0.2-py3.4.egg/hax/paxroot.py in loop_over_datasets(datasets_names, event_function, branch_selection)
84 except Exception as e:
85 rootfile.Close()
---> 86 raise e
87
88 # For backward compatibility
/project/lgrandi/anaconda3/envs/pax_head/lib/python3.4/site-packages/hax-0.2-py3.4.egg/hax/paxroot.py in loop_over_datasets(datasets_names, event_function, branch_selection)
79 t.GetEntry(event_i)
80 event = t.events
---> 81 event_function(event)
82 except StopEventLoop:
83 rootfile.Close()
/project/lgrandi/anaconda3/envs/pax_head/lib/python3.4/site-packages/hax-0.2-py3.4.egg/hax/minitrees.py in process_event(self, event)
38
39 def process_event(self, event):
---> 40 self.cache.append(self.extract_data(event))
41 self.check_cache()
42
/project/lgrandi/anaconda3/envs/pax_head/lib/python3.4/site-packages/hax-0.2-py3.4.egg/hax/treemakers/common.py in extract_data(self, event)
84 continue
85 if p.detector == 'tpc':
---> 86 peak_type = p.type
87 else:
88 # Lump all non-lone-hit veto peaks together as 'veto'
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf2 in position 0: invalid continuation byte
Why split the variables up?
OSError: File b'/Users/tunnell/Work/anaconda/envs/xenon_stack/lib/python3.4/site-packages/hax-1.3.0-py3.4.egg/hax/sc_variables.csv' does not exist
I don't think you included this file in the setup.py @JelleAalbers
It would be really nice if we could leverage the batch queue to make minitrees for us... with an easy flag like batch_queue=True or something rather than asking around who has the latest version of the script with the right qsub/bash incantations.
We need at least "last_busy_type" = on/off and "last_hev_type" = on/off.
We might also want nearest/previous type for convenience.
This should not be needed for all good data. But for old data with DAQ bug we need it to reject partial events.
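A "state in effect at this event" lookup can be implemented with a sorted-search over the toggle times. A standalone sketch of the idea (the data and function name are invented; real inputs would come from the trigger/acquisition monitor data):

```python
import numpy as np

# Given sorted times at which the busy/HEV signal toggled and the state
# after each toggle, find the state in effect at each event time.

def state_at(event_times, toggle_times, states, initial='off'):
    # index of the last toggle at or before each event time
    idx = np.searchsorted(toggle_times, event_times, side='right') - 1
    return [states[i] if i >= 0 else initial for i in idx]
```

Returning the nearest/previous toggle time as well would cover the convenience case mentioned above.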
Getting error:
ValueError: field 'index' occurs more than once
when using load_single_dataset in lax.
Should this hack in load() to remove the index column go in load_single_dataset() instead?