I think the way parameter files (e.g. calibration) are handled in FairRoot/R3BRoot is terrible.
Custom classes inheriting from TObject are painful enough when you are dealing with TClonesArrays (where one could argue that the benefits in speed and size justify the pains of ROOT), but for parameter files they are a terrible design choice. A calibration just for califa (payload perhaps 40kB worth of ASCII-encoded floats) takes 250kB in the par format (all serialized arrays with at least 2432 elements, most of it stuff like APD numbers) or more than 1MB in the ROOT format (most of it histograms).
Naturally, FairRoot does no collision checking when the same object is defined in multiple *par files (it seems the earlier one in the TList wins), and at least in R3BRoot we do not even check for the existence of input parameters: R3BCalifaMap2CrystalCal will happily read a (default-constructed?) zero array from FairRuntimeDB if no files were specified. This of course violates "Fail as early as possible".
I want to propose a solution offering the following features:
- Multi-format support, including (potentially) human-readable ones used by people who have not been exposed to ROOT (such as csv, json, xml)
- Good support for mixing and replacing individual parameter sets (e.g. "Take the califa cal from s444 with the TofD cal from s494")
- Enforcement of parameter existence
- Enforcement of unique definitions
- Namespace support
- Possibility of version control
I am talking, of course, of this new and shiny feature called "directory" (or "folder" for Apple users) in the recent innovation called "hierarchical file system". Furthermore, I propose a single environment variable $R3BPARS, which will point to a parameter directory.
So R3BCalifaMap2CrystalCal might just look for $R3BPARS/califa/calibration.json (or .csv, or .xml) and parse that, easy-peasy. If it turns out that 3d models stored in root files are really the way to go (instead of procedurally generating everything at startup a la geant), we could easily accommodate them as well. Backwards compatibility is easily achieved: feed $R3BPARS/par/.par and $R3BPARS/root/.root to R3BRuntimeDB.
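The lookup described above could be sketched roughly like this, assuming C++17 with std::filesystem. The names parRoot and openParFile are made up for illustration and do not exist in R3BRoot; the point is only that a missing variable or file is reported at once instead of producing a zero array later.

```cpp
#include <cassert>
#include <cstdlib>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not in R3BRoot): read the parameter root from $R3BPARS.
inline fs::path parRoot()
{
    const char* root = std::getenv("R3BPARS");
    if (root == nullptr || *root == '\0')
        throw std::runtime_error("environment variable R3BPARS is not set");
    return fs::path(root);
}

// Hypothetical helper: open <root>/<subpath> for reading, or throw right away.
inline std::ifstream openParFile(const fs::path& root, const std::string& subpath)
{
    // An absolute subpath would silently replace root in operator/,
    // so treat it as a programming error.
    assert(fs::path(subpath).is_relative());

    const fs::path full = root / subpath;
    if (!fs::is_regular_file(full))
        throw std::runtime_error("missing parameter file: " + full.string());

    std::ifstream in(full);
    if (!in)
        throw std::runtime_error("cannot open parameter file: " + full.string());
    return in;
}
```

A task would then call something like openParFile(parRoot(), "califa/calibration.json") during initialization and hand the stream to whatever parser it prefers.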
By having a dedicated class (call it R3BParManager or something) manage parameter file lookup (e.g. pasting env var + subpath), we could later also support e.g. validity time intervals for parameter files to transparently support multiple beamtimes: take the first wrts and search the subdirectories which are valid for that time for the subpath. If you find exactly one match, return that; otherwise, error. In the end, this will just return a filename: R3BParManager::get("califa/calibration.json") for epoch 1591741032 resolves to $R3BPARS/202105_s494/califa/calibration.json.
Thus, this will be much more legible than the present approach. For comparison, here is the relevant output of FairRuntimeDB::print when I use a TList for both the first and the second input (after initialization, just before starting the run):
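A rough sketch of that time-dependent lookup, again in C++17. How a directory declares its validity interval is not specified above, so this sketch simply assumes that each beamtime directory holds a file validity.txt containing two epoch seconds, "begin end". That convention, and the names isValidAt and resolvePar, are inventions for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Assumed convention: <dir>/validity.txt holds "<begin> <end>" in epoch
// seconds, with begin inclusive and end exclusive. Directories without a
// readable validity file are never considered valid.
inline bool isValidAt(const fs::path& dir, std::int64_t epoch)
{
    std::ifstream in(dir / "validity.txt");
    std::int64_t begin = 0;
    std::int64_t end = 0;
    if (!(in >> begin >> end))
        return false;
    return begin <= epoch && epoch < end;
}

// Hypothetical resolver: return the one file matching subpath among all
// subdirectories of root that are valid at epoch; throw on zero or
// multiple matches.
inline fs::path resolvePar(const fs::path& root, const std::string& subpath, std::int64_t epoch)
{
    // An absolute subpath would replace the directory in operator/.
    assert(fs::path(subpath).is_relative());

    std::vector<fs::path> matches;
    for (const auto& entry : fs::directory_iterator(root))
    {
        if (!entry.is_directory() || !isValidAt(entry.path(), epoch))
            continue;
        const fs::path candidate = entry.path() / subpath;
        if (fs::is_regular_file(candidate))
            matches.push_back(candidate);
    }

    if (matches.size() != 1)
        throw std::runtime_error("expected exactly one match for " + subpath + ", found " +
                                 std::to_string(matches.size()));
    return matches.front();
}
```

Because both a missing and an ambiguous definition throw, existence and uniqueness are enforced in one place, and the returned path can be logged so that it is obvious which file a parameter came from.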
-------------- actual containers in runtime database -------------------------
FairBaseParSet class for parameter io
FairGeoParSet class for Geo parameter
califaCrystalCalPar Califa Calibration Parameters
CalifaTotCalPar Califa Tot Cal parameters
-------------- runs, versions ------------------------------------------------
run id
container 1st-inp 2nd-inp output
run: 5
FairBaseParSet -1 -1 0
FairGeoParSet -1 -1 0
califaCrystalCalPar -1 -1 0
CalifaTotCalPar -1 -1 0
-------------- input/output --------------------------------------------------
first Input:
Ascii I/O /what/ever/all_101442.par is open
detector I/Os: FairGenericParIo
second Input:
OBJ: FairParRootFile allParams_20211118_205834.root : 0 at: 0x5637eaceadf0
Root file I/O allParams_20211118_205834.root is open
detector I/Os: FairGenericParIo
output: none
The only relevant information (apart from the fact that your directories get spammed with all* files) here is that both TLists somewhere included some FairGenericParIo, and the names of the different parameters used. (Or the names of their classes, rather hard to tell).
Not visible:
- What input files were used
- Which input files define which parameter names (of which classes)
- If there were collisions, which file's definition won? (This -1 would indicate "not present", I guess, but actually califaCrystalCalPar was taken from one of the input files.)
One perceived downside of this proposal is that different people have different formats they like and want to use. So different detectors will pick different formats, dragging in software requirements for parsing xml, json, bmp, sqlite, ods and so forth. I think this argument would be more valid if the current standard was less painful, e.g. if I were an xml fanboi and wanted to write my parameter file in xml when all other detectors used json, or vice versa. Still, I think that making nlohmann/json or zeux/pugixml a subrepository of R3BRoot is probably painful enough that few people will go to the trouble of adding another option for parsing structured text if there is already one available. Besides, $SIMPATH is already 7.2GiB.
In my experience, stuff from ROOT gets used by FairRoot regardless of alternatives (like FairParFooFile::open taking a TList of TObjString instead of an STL vector or std::string), and anything from FairRoot gets used by R3BRoot the same way. I can only speak for myself, but I am confident that neither the ROOT devs nor the FairRoot devs have much in the way of kompromat on me. We as devs have the ability to pick the stuff we like from FairRoot/ROOT and to replace the stuff we don't like with alternatives from the wider free software world outside HEP.
So if I get around to fixing the califa calibration format, I will simply skip FairRuntimeDB for something which feels less painful.
PS: the automatic histogram creation based on axis definitions is working nicely, I will probably make a PR after I finish writing my PhD and just make cmake compile all of that conditionally on GCC_VERSION>=10 (or equivalent clang).