nwuhep / ntupleproducer
Northwestern ntuplizer tools for use with CMSSW.
Home Page: https://twiki.cern.ch/twiki/bin/view/CMS/UserCodeNWUntupleProducer
It seems that
_isReco and the _type of the PhysObjects are never used or set.
Perhaps we may get rid of them.
We may want to include the high-pt muon id, according to this:
https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideMuonId#New_Version_recommended
According to this: https://twiki.cern.ch/twiki/bin/view/CMS/HoverE2012
we need to implement new HOverE for electrons. The photons are fine, it seems.
The GsfElectron and Photon DataFormats are updated in CMSSW_5 so that the new H/E variable can be accessed directly:
GsfElectron::hcalOverEcalBc();
Photon::hadTowOverEm();
Important: Note that if using the above methods, the corresponding methods must be used for detector based Hcal isolation (with HCAL footprint removal corresponding to the new H/E numerator). For electrons, a dedicated method is provided for each of two cone sizes:
GsfElectron::dr03HcalDepth1TowerSumEtBc();
GsfElectron::dr04HcalDepth1TowerSumEtBc();
Who is Rafael and why does he leave a comment saying he made a change next to his changes? In a way, this is a nice idea, but if we all did this the code would be a ridiculous mess. I think we can rely on the versioning system to let us know who made what changes and when they do it.
Also, he's mixing tabs and spaces which is mildly annoying.
We may need to reverse the boolean for tracking filters.
From the noise filters twiki, https://twiki.cern.ch/twiki/bin/viewauth/CMS/MissingETOptionalFilters
For the three tracking POG filters, there are example configurations of the "taggingMode" in the above mentioned python file exampleICHEPrecommendation_cfg.py. One thing to be made clear is that the stored boolean for the three tracking POG filters is the opposite to what we usually have for other filters. Specifically, for the tracking POG filters, true means rejected bad events while false means good events.
Also, someone has to check the effect of those filters on the most common datasets. Do they matter at all?
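A minimal sketch of how the inversion could be handled when filling the ntuple. The function names `passesTrackingFilter` and `passesAllFilters` are hypothetical and not part of the ntuple producer; the only input taken from the twiki is that tracking POG filters store true for rejected events.

```cpp
// Hypothetical helper: convert a stored tracking POG filter flag to the
// convention used by the other filters (true = event passes).
inline bool passesTrackingFilter(bool storedTrackingFlag) {
    // Tracking POG filters store true for rejected (bad) events,
    // so the pass decision is the negation of the stored value.
    return !storedTrackingFlag;
}

// Hypothetical combination of one ordinary filter decision (true = good)
// with one tracking POG filter flag (true = bad).
inline bool passesAllFilters(bool ordinaryFilterPass, bool storedTrackingFlag) {
    return ordinaryFilterPass && passesTrackingFilter(storedTrackingFlag);
}
```

Doing the flip once at fill time would mean analysis code never has to remember which filters use which convention.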
Zeynep claims that she was running the ntuplizer at FNAL and it failed due to excessive RAM usage. More info as I investigate. Andrey, contribute for once in your life.
For Dalitz analysis in the electron channel we may need to add variables from the Second Base-supercluster. Here is a snippet from Ming:
for (CaloCluster_iterator itbc = iEle->superCluster()->clustersBegin(); itbc != iEle->superCluster()->clustersEnd(); ++itbc) {
    eleBCEn.push_back((*itbc)->energy());
    eleBCEta.push_back((*itbc)->eta());
    eleBCPhi.push_back((*itbc)->phi());
    eleBCS25.push_back(lazyTool->e2x5Max(**itbc)/lazyTool->e5x5(**itbc));
    eleBCS15.push_back(lazyTool->e1x5(**itbc)/lazyTool->e5x5(**itbc));
    eleCov = lazyTool->localCovariances(**itbc);
    eleBCSieie.push_back(sqrt(eleCov[0]));
    eleBCSieip.push_back(eleCov[1]);
    eleBCSipip.push_back(eleCov[2]);
}
nate do it
Multicrab functionality should be implemented for ntuple producer.
https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideMultiCrab
HZG->visible and HZG->monophoton have diverged enough that we can remove MET from our ntuples. They will cover MET in their fork
Need to address the issue of saving Ecal crystals. Currently, for the Photons, there are up to 100 (!) crystals saved, each contains 10 variables.
This accounts for 3% of the ntuple size of double electron sample.
Currently, besides us, we have northeastern, brown, maybe some greek ones? possibly more. Do you guys have other groups using the NWUtuples?
From Stoyan, these are the relevant triggers for j/psi study:
HLT_Mu15_TkMu5_Onia_*
HLT_Dimuon8_Jpsi_*
HLT_Dimuon10_Jpsi_*
In the non-parked data there are few more that may be interesting -
all with "Jpsi" in the name (but drop the "Displaced")
Not sure if MuOnia and MuOniaParked were combined (as happened
to DoubleMu* in reReco).
Photons are not removed from pf-Isolation as Nate found out.
We all have various dumping functions, more advanced individuals even have a special class for dumping stuff.
We should implement this functionality in TCObjects so that we can call it directly from any TCPhysObject: obj.Dump(). Calling it from a Muon or Electron should override the method and print more information specific to that object.
We could have two levels of dumps, obj.Dump(lvl), where lvl controls the amount of info to dump.
Merge it to master.
I think there should be some skimming functionality implemented in the ntuples. We could come up with a baseline selection that works for all of the relevant analyses and people can remove the skim if they happen to be interested in some special study that the skim conflicts with.
I think a good starting point is requiring at least one lepton above some pt threshold (pt_mu > 4 or pt_e > 10). On top of this could be the requirement that if none of the specified trigger paths fire, the event is not saved.
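As a sketch of that baseline selection: the thresholds below are the ones quoted above (pt_mu > 4, pt_e > 10), while `SkimInput`, `anyAbove` and `passesBaselineSkim` are hypothetical names, and the `requireTrigger` switch is an assumption about how the skim could be relaxed for special studies.

```cpp
#include <vector>

// Hypothetical per-event summary used only for the skim decision.
struct SkimInput {
    std::vector<double> muonPt;      // GeV
    std::vector<double> electronPt;  // GeV
    bool anyTriggerFired;            // OR of the specified trigger paths
};

// True if at least one entry is strictly above the threshold.
inline bool anyAbove(const std::vector<double>& pts, double threshold) {
    for (double pt : pts) {
        if (pt > threshold) return true;
    }
    return false;
}

// Keep the event if it has a muon above 4 GeV or an electron above 10 GeV
// and, when requireTrigger is set, at least one specified trigger fired.
inline bool passesBaselineSkim(const SkimInput& ev, bool requireTrigger = true) {
    const bool hasLepton = anyAbove(ev.muonPt, 4.0) || anyAbove(ev.electronPt, 10.0);
    if (!hasLepton) return false;
    return !requireTrigger || ev.anyTriggerFired;
}
```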
Started at myAnalyzer/README.md
It has to be extended to cover the most important topics, so that any newcomer could pick it up.
Most of the variables for Electron and Photon have been included into the class definition. We need to remove IdMaps from PhysObject definition.
We may keep a map inside TCEgamma though, but even there it shouldn't be necessary.
In GenParticle class: implement two more methods:
GetMotherPDGId() and GetGrandMotherPDGId().
No need for a Set method, just getters. If the mother/grandmother does not exist, return 0.
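A possible shape for these getters, sketched on a stand-in class. `GenParticleSketch`, its constructor and `GetPDGId()` are assumptions made for illustration; the real GenParticle class may store the mother link differently.

```cpp
// Stand-in for the GenParticle class; only the PDG id and the mother link
// are modelled here.
class GenParticleSketch {
public:
    explicit GenParticleSketch(int pdgId, const GenParticleSketch* mother = nullptr)
        : pdgId_(pdgId), mother_(mother) {}

    int GetPDGId() const { return pdgId_; }

    // Returns 0 when there is no mother.
    int GetMotherPDGId() const {
        return mother_ ? mother_->GetPDGId() : 0;
    }

    // Returns 0 when either the mother or the grandmother is missing.
    int GetGrandMotherPDGId() const {
        return mother_ ? mother_->GetMotherPDGId() : 0;
    }

private:
    int pdgId_;
    const GenParticleSketch* mother_;  // not owned
};
```

Returning 0 is safe as a "missing" marker because 0 is not a valid PDG id.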
I also wanted to open an issue...
Feels good!
A lot of the variables are common among Electron and Photon. We should combine them into a class.
TCElectron and TCPhoton will inherit from TCEgamma, which will inherit from TCPhysObject as usual.
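The hierarchy could look roughly like this. All class names carry a Sketch suffix because they are stand-ins, and the choice of R9 and H/E as the shared variables is only an example picked from variables mentioned elsewhere on this page.

```cpp
// Stand-in for TCPhysObject.
class TCPhysObjectSketch {
public:
    virtual ~TCPhysObjectSketch() = default;
    void SetPt(float pt) { pt_ = pt; }
    float Pt() const { return pt_; }

private:
    float pt_ = 0.f;
};

// Variables common to electrons and photons live in the shared base.
class TCEgammaSketch : public TCPhysObjectSketch {
public:
    void SetR9(float r9) { r9_ = r9; }
    void SetHadOverEm(float hoe) { hadOverEm_ = hoe; }
    float R9() const { return r9_; }
    float HadOverEm() const { return hadOverEm_; }

private:
    float r9_ = 0.f;
    float hadOverEm_ = 0.f;
};

// Electron-only quantities stay in the electron class.
class TCElectronSketch : public TCEgammaSketch {
public:
    void SetPassConversionVeto(bool pass) { passConversionVeto_ = pass; }
    bool PassConversionVeto() const { return passConversionVeto_; }

private:
    bool passConversionVeto_ = false;
};

// The photon class only adds what is photon specific (nothing in this sketch).
class TCPhotonSketch : public TCEgammaSketch {};
```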
For higgs dalitz study in electron channel, a few extra triggers might be useful,
that is pho26_pho22 with mass constraint.
Exact names can be found here:
https://indico.cern.ch/getFile.py/access?contribId=37&sessionId=10&resId=0&materialId=slides&confId=278923
Two requests from Stoyan:
muCon->SetIsTrArbitrated(muon::isGoodMuon(iMuon, muon::TrackerMuonArbitrated));
muCon->SetIsArbitrated(muon::isGoodMuon(iMuon, muon::AllArbitrated));
The way it is defined it should be called PassConversionVeto()
meaning that if it's true it is not conversion.
New MC samples at 13 TeV are produced in CMSSW_7X, so we need the code ready to ntuplize them.
Only relevant for those who plan to stay at CMS that long.
I got vimdiff to work with git finally (set it for git difftool, not git diff). So andrey can go ahead and use that to fix stuff!
Next time when we change the definition of any TObject class we should increment the number inside ClassDef().
From here: http://root.cern.ch/drupal/content/adding-your-class-root-classdef
Do you guys use 7TeV data? Is anyone intending on using a version of these ntuples in 44X or on legacy 53X rerecos?
New MET recipes are for CMSSW_5_3_12_patch2:
https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideMETRecipe53X
It would get rid of one of PAT packages (it is always good)
For the Dalitz we need all the muons above 3 GeV stored or (only if storing all muons makes the ntuple too large) at least all the (tracker OR global) muons. We need the tracker chi2/ndf and also isSoft and isTight (available muon selectors).
For GEN: I talked with Brian - we need to store the pointer (or index) of the GEN particle and the pointer of (index of) its mother. We don't need the grandmother variable.
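If we go with indices rather than pointers, a sketch could look like the following. `GenRecordSketch` and `motherPdgId` are hypothetical names, and using -1 for "no mother stored" is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical flat record: each particle stores the index of its mother
// in the same vector.
struct GenRecordSketch {
    int pdgId;
    int motherIndex;  // index into the same vector, -1 if no mother is stored
};

// Look up the mother's PDG id through the stored index; returns 0 when the
// particle index or the mother index is out of range.
inline int motherPdgId(const std::vector<GenRecordSketch>& gen, std::size_t i) {
    if (i >= gen.size()) return 0;
    const int m = gen[i].motherIndex;
    if (m < 0 || static_cast<std::size_t>(m) >= gen.size()) return 0;
    return gen[static_cast<std::size_t>(m)].pdgId;
}
```

Indices stay valid when the collection is written out and read back, which is not true of raw pointers.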
Memory leaks from the EGamma code affect ntuple production on a large scale (CRAB jobs crash).
The issue is posted to hypernews:
https://hypernews.cern.ch/HyperNews/CMS/get/egamma/1371.html
Yet nobody is likely to resolve this, so we need to dig in ourselves, with valgrind.
At some point we need to address trigger objects saved in the ntuples.
Some ideas:
Will move to non-triggering MVA, as used in HZZ4l:
https://twiki.cern.ch/twiki/bin/view/CMS/MultivariateElectronIdentification#Non_triggering_MVA
This is currently trained and verified for the jan22 ReReco; our MVA is custom and out of date.
We need to have NO cuts in signal MC samples. It is better to have the samples
for different channels (mu, e) separated - it is easy to mix, not so easy to "un-mix".
E1x3 is actually e3x1, need to fix it. (Both should read e1x3, but I have lysdexia.)
Such as:
eSeedClusterOverP
math::XYZPointF trackPositionAtVtx()
float e5x5 (why isn't it there already!?)
Conversion variables:
float convDist ; // distance to the conversion partner
float convDcot ; // difference of cot(angle) with the conversion partner track
float convRadius; //signed conversion radius
mva() - is there another MVA for electrons?
http://cmslxr.fnal.gov/lxr/source/DataFormats/EgammaCandidates/interface/GsfElectron.h#574
More possible variables are listed here:
http://cmslxr.fnal.gov/lxr/source/DataFormats/EgammaCandidates/interface/GsfElectron.h
Update: there are also some variables saved twice, both in IdMaps and on their own, such as R9. Will leave only one instance.
There is no reason to use a Map there, it should be a struct.
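To illustrate the difference, here is a sketch of a struct that could replace the map, plus a helper for migrating existing map contents. `IdMapSketch`, `EGammaIdVarsSketch`, `getOr`, `fromIdMap` and the key strings are all hypothetical; the real IdMaps keys may differ.

```cpp
#include <map>
#include <string>

// Map-based storage: keys are compared as strings at run time, and a
// mistyped key passed to operator[] silently creates a new entry.
using IdMapSketch = std::map<std::string, float>;

// Struct-based storage: fixed layout, member names checked at compile time.
struct EGammaIdVarsSketch {
    float r9 = 0.f;
    float sigmaIetaIeta = 0.f;
    float hadOverEm = 0.f;
};

// Read a key from the map, falling back to a default when it is absent.
inline float getOr(const IdMapSketch& m, const std::string& key, float fallback) {
    const auto it = m.find(key);
    return it != m.end() ? it->second : fallback;
}

// Migration helper: copy the known keys into the struct.
inline EGammaIdVarsSketch fromIdMap(const IdMapSketch& m) {
    EGammaIdVarsSketch v;
    v.r9 = getOr(m, "R9", 0.f);
    v.sigmaIetaIeta = getOr(m, "SigmaIetaIeta", 0.f);
    v.hadOverEm = getOr(m, "HadOverEm", 0.f);
    return v;
}
```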