nwuhep / ntupleproducer

Northwestern ntuplizer tools for use with CMSSW.

Home Page: https://twiki.cern.ch/twiki/bin/view/CMS/UserCodeNWUntupleProducer

C++ 70.68% Python 29.09% Objective-C 0.13% Shell 0.10%

ntupleproducer's Issues

Unused variables in TCPhysObject class

It seems that _isReco and _type of the PhysObjects are never used or set.
Perhaps we can get rid of them.

High pt muons

We may want to include the high-pt muon id, according to this:
https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideMuonId#New_Version_recommended

New HoverE2012 for electrons

According to this: https://twiki.cern.ch/twiki/bin/view/CMS/HoverE2012
we need to implement new HOverE for electrons. The photons are fine, it seems.

The GsfElectron and Photon DataFormats are updated in CMSSW_5 so that the new H/E variable can be accessed directly:

  GsfElectron::hcalOverEcalBc();
  Photon::hadTowOverEm();

Important: Note that if using the above methods, the corresponding methods must be used for detector based Hcal isolation (with HCAL footprint removal corresponding to the new H/E numerator). For electrons, a dedicated method is provided for each of two cone sizes:

  GsfElectron::dr03HcalDepth1TowerSumEtBc();
  GsfElectron::dr04HcalDepth1TowerSumEtBc();

Who is Rafael?

Who is Rafael and why does he leave a comment saying he made a change next to his changes? In a way, this is a nice idea, but if we all did this the code would be a ridiculous mess. I think we can rely on the versioning system to let us know who made what changes and when they do it.

Also, he's mixing tabs and spaces which is mildly annoying.

Check the bool for tracking POG filters

We may need to reverse the boolean for tracking filters.
From the noise filters twiki, https://twiki.cern.ch/twiki/bin/viewauth/CMS/MissingETOptionalFilters
For the three tracking POG filters, there are example configurations of the "taggingMode" in the above mentioned python file exampleICHEPrecommendation_cfg.py. One thing to be made clear is that the stored boolean for the three tracking POG filters is the opposite to what we usually have for other filters. Specifically, for the tracking POG filters, true means rejected bad events while false means good events.

Also, someone has to check the effect of those filters on the most common datasets. Do they matter at all?
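
One way to avoid mixing up the two conventions is to normalize every stored decision to "true means the event passes" when reading it back. A minimal sketch, assuming a hypothetical helper named eventPassesFilter that does not exist in the ntuplizer:

```cpp
// Hypothetical helper, not part of the ntuplizer.
// For most MET filters the stored boolean is true for good events.
// For the three tracking POG filters in tagging mode it is the opposite:
// true means the event was rejected as bad.
inline bool eventPassesFilter(bool storedDecision, bool isTrackingPOGFilter) {
    return isTrackingPOGFilter ? !storedDecision : storedDecision;
}
```

With this, the analysis code can require eventPassesFilter(...) for all filters without remembering which ones are inverted.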

Possible memory leak

Zeynep claims that she was running the ntuplizer at FNAL and it failed due to excessive RAM usage. More info as I investigate. Andrey, contribute for once in your life.

More variables from second SuperCluster

For Dalitz analysis in the electron channel we may need to add variables from the Second Base-supercluster. Here is a snippet from Ming:

  for (CaloCluster_iterator itbc = iEle->superCluster()->clustersBegin(); itbc != iEle->superCluster()->clustersEnd(); ++itbc) {
    eleBCEn   .push_back((*itbc)->energy());
    eleBCEta  .push_back((*itbc)->eta());
    eleBCPhi  .push_back((*itbc)->phi());
    eleBCS25  .push_back(lazyTool->e2x5Max(**itbc)/lazyTool->e5x5(**itbc));
    eleBCS15  .push_back(lazyTool->e1x5(**itbc)/lazyTool->e5x5(**itbc));
    eleCov = lazyTool->localCovariances(**itbc);
    eleBCSieie.push_back(sqrt(eleCov[0]));
    eleBCSieip.push_back(eleCov[1]);
    eleBCSipip.push_back(eleCov[2]);
  }

Update readme with multicrab blurb

nate do it

Incorporate multicrab

Multicrab functionality should be implemented for ntuple producer.

https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideMultiCrab

Remove MET integration

HZG->visible and HZG->monophoton have diverged enough that we can remove MET from our ntuples. They will cover MET in their fork

EGamma crystals

Need to address the issue of saving Ecal crystals. Currently, for the Photons, there are up to 100 (!) crystals saved, each containing 10 variables.
This accounts for 3% of the ntuple size of the double-electron sample.

Other groups that use our ntuples

Currently, besides us, we have Northeastern, Brown, maybe some Greek ones? Possibly more. Do you guys have other groups using the NWUtuples?

Add more trigger names for quarkonia

From Stoyan, these are the relevant triggers for the J/psi study:

HLT_Mu15_TkMu5_Onia_*
HLT_Dimuon8_Jpsi_*
HLT_Dimuon10_Jpsi_*

In the non-parked data there are a few more that may be interesting,
all with "Jpsi" in the name (but drop the "Displaced" ones).

Not sure if MuOnia and MuOniaParked were combined (as happened
to DoubleMu* in reReco).

Figure out the electron isolation issue

As Nate found out, photons are not removed from the PF isolation.

Implement Dump() function

We all have various dumping functions; more advanced individuals even have a special class for dumping stuff.
We should implement this functionality in TCObjects such that we can call it directly from any TCPhysObject: obj.Dump(). Calling it from a Muon or Electron should use an overridden method that prints more information specific to that object.

We could have two levels of dumps, obj.Dump(lvl), where lvl controls the amount of info to dump.
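
A rough sketch of how this could look, using stand-in classes rather than the real TCPhysObject and TCMuon. The member names, the meaning of each level, and the DumpToString helper are all assumptions made for illustration:

```cpp
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for TCPhysObject: only the fields needed for the illustration.
class PhysObject {
public:
    PhysObject(double pt, double eta) : pt_(pt), eta_(eta) {}
    virtual ~PhysObject() = default;

    // lvl = 0: kinematics only. Derived classes print more for lvl >= 1.
    virtual void Dump(int lvl = 0, std::ostream& os = std::cout) const {
        (void)lvl;  // the base class prints the same thing at every level
        os << "pt=" << pt_ << " eta=" << eta_ << '\n';
    }

private:
    double pt_;
    double eta_;
};

// Stand-in for TCMuon: overrides Dump() to add muon-specific information.
class Muon : public PhysObject {
public:
    Muon(double pt, double eta, bool isGlobal)
        : PhysObject(pt, eta), isGlobal_(isGlobal) {}

    void Dump(int lvl = 0, std::ostream& os = std::cout) const override {
        PhysObject::Dump(lvl, os);
        if (lvl >= 1) os << "  isGlobal=" << isGlobal_ << '\n';
    }

private:
    bool isGlobal_;
};

// Hypothetical convenience for logging: capture the dump as a string.
std::string DumpToString(const PhysObject& obj, int lvl) {
    std::ostringstream os;
    obj.Dump(lvl, os);
    return os.str();
}
```

Because Dump() is virtual, calling it through a TCPhysObject reference or pointer would still reach the Muon or Electron version.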

Merge Nate's jet development

Merge it to master.

Skimming the ntuples

I think there should be some skimming functionality implemented in the ntuples. We could come up with a baseline selection that works for all of the relevant analyses and people can remove the skim if they happen to be interested in some special study that the skim conflicts with.

I think a good starting point is requiring at least one lepton above some pt threshold (pt_mu > 4 or pt_e > 10). On top of this could be the requirement that if none of the specified trigger paths fire, the event is not saved.
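
The baseline selection proposed above could be sketched roughly as follows. The Lepton struct and the function name are made up for the example, and the thresholds are just the ones suggested in this issue:

```cpp
#include <vector>

// Hypothetical minimal lepton record, used only for this illustration.
struct Lepton {
    double pt;
    bool isMuon;  // false means electron
};

// Keep the event only if at least one of the specified trigger paths fired
// and at least one lepton is above its pt threshold.
bool PassesBaselineSkim(const std::vector<Lepton>& leptons, bool anyTriggerFired,
                        double muPtMin = 4.0, double elePtMin = 10.0) {
    if (!anyTriggerFired) return false;
    for (const Lepton& lep : leptons) {
        if (lep.pt > (lep.isMuon ? muPtMin : elePtMin)) return true;
    }
    return false;
}
```

Keeping the thresholds as parameters would let people loosen or disable the skim for special studies.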

Create a better tutorial on analyzer

Started at myAnalyzer/README.md
It has to be extended to cover the most important topics, so that any newbie could pick it up.

Abandon IdMap and IsoMap

Most of the variables for Electron and Photon have been included into the class definition. We need to remove IdMaps from PhysObject definition.
We may keep a map inside TCEgamma though, but even there it shouldn't be necessary.

Implement MotherID and GrandMotherID

In GenParticle class: implement two more methods:
GetMotherPDGId() and GetGrandMotherPDGId().
No need for a Set method, just getters. If the mother/grandmother does not exist, return 0.
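
A minimal sketch of the two getters, using a stand-in GenParticle class that holds a plain pointer to its mother. How the real class links to the mother (pointer, index, or something ROOT-specific) is an assumption here:

```cpp
// Stand-in for the GenParticle class; member names are illustrative.
class GenParticle {
public:
    explicit GenParticle(int pdgId, const GenParticle* mother = nullptr)
        : pdgId_(pdgId), mother_(mother) {}

    int GetPDGId() const { return pdgId_; }

    // Returns 0 when there is no mother.
    int GetMotherPDGId() const { return mother_ ? mother_->pdgId_ : 0; }

    // Returns 0 when there is no mother or the mother has no mother.
    int GetGrandMotherPDGId() const {
        return mother_ ? mother_->GetMotherPDGId() : 0;
    }

private:
    int pdgId_;
    const GenParticle* mother_;  // not owned; may be null
};
```

For a chain such as H (25) -> Z (23) -> mu (13), the muon would report 23 as its mother and 25 as its grandmother, while the Z would report 0 for its grandmother.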

Number 3

I also wanted to open an issue...
Feels good!

Create a New Object class: TCEGamma

A lot of the variables are common among Electron and Photon. We should combine them into a class.
TCElectron and TCPhoton will inherit from TCEGamma, which would inherit from TCPhysObject as usual.
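
The proposed hierarchy might look something like this. The classes below are stand-ins, and the choice of shared variables (R9, H/E) and all method names are assumptions for illustration only:

```cpp
// Stand-ins for the TC* classes; all member names here are illustrative.
class PhysObject {
public:
    virtual ~PhysObject() = default;
    void SetPt(double pt) { pt_ = pt; }
    double Pt() const { return pt_; }

private:
    double pt_ = 0.0;
};

// Variables shared by electrons and photons are declared once, here.
class EGamma : public PhysObject {
public:
    void SetR9(double r9) { r9_ = r9; }
    void SetHadOverEm(double hOverE) { hadOverEm_ = hOverE; }
    double R9() const { return r9_; }
    double HadOverEm() const { return hadOverEm_; }

private:
    double r9_ = 0.0;
    double hadOverEm_ = 0.0;
};

// Electron-only variables stay in the derived class.
class Electron : public EGamma {
public:
    void SetPassConversionVeto(bool pass) { passConversionVeto_ = pass; }
    bool PassConversionVeto() const { return passConversionVeto_; }

private:
    bool passConversionVeto_ = false;
};

class Photon : public EGamma {};

// Code that only needs the shared variables can accept either type.
inline bool HasHighR9(const EGamma& obj, double r9Min) {
    return obj.R9() > r9Min;
}
```

The benefit is that selection code written against the common base works for both electrons and photons without duplication.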

Add pho26_pho22 with mass trigger names

For the Higgs Dalitz study in the electron channel, a few extra triggers might be useful,
namely pho26_pho22 with a mass constraint.
Exact names can be found here:
https://indico.cern.ch/getFile.py/access?contribId=37&sessionId=10&resId=0&materialId=slides&confId=278923

Arbitrated Muons

Two requests from Stoyan:
muCon->SetIsTrArbitrated(muon::isGoodMuon(iMuon, muon::TrackerMuonArbitrated));
muCon->SetIsArbitrated(muon::isGoodMuon(iMuon, muon::AllArbitrated));

ConversionVeto() and convVeto variable in electrons is ambiguous

The way it is defined, it should be called PassConversionVeto(),
meaning that if it's true, the electron is not from a conversion.

Need to set up the code for producing ntuples from 13 TeV samples

New MC samples at 13 TeV are produced in CMSSW_7X, so we need the code ready to ntuplize them.
Only relevant for those who plan to stay at CMS so long.

Add the v6.4 changes back

I got vimdiff to work with git finally (set it for git difftool, not git diff). So andrey can go ahead and use that to fix stuff!

Incrementing number in ClassDef()

The next time we change the definition of any TObject class, we should increment the number inside ClassDef().
From here: http://root.cern.ch/drupal/content/adding-your-class-root-classdef

7 TeV data

Do you guys use 7TeV data? Is anyone intending on using a version of these ntuples in 44X or on legacy 53X rerecos?

Make updates to run in CMSSW_5_3_12

New MET recipes are for CMSSW_5_3_12_patch2:
https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideMETRecipe53X
It would let us drop one of the PAT packages, which is always good.

muon variables in Ntuples; gen variables

For the Dalitz we need all the muons above 3 GeV stored or (only if storing all muons makes the ntuple too large) at least all the (tracker OR global) muons. We need the tracker chi2/ndf and also isSoft and isTight (available muon selectors).

For GEN: I talked with Brian - we need to store the pointer (or index) of the GEN particle and the pointer of (index of) its mother. We don't need the grandmother variable.

Memory leaks in EGamma packages

Memory leaks from the EGamma code affect the ntuple production on a large scale (Crab jobs crash).
The issue is posted to hypernews:
https://hypernews.cern.ch/HyperNews/CMS/get/egamma/1371.html
Yet nobody is going to resolve this, probably. So we need to dig in ourselves, with valgrind.

Resolve Trigger objects issues

At some point we need to address trigger objects saved in the ntuples.

Some ideas:

Instead of the HLT trigger name, store a ULong64 bit mask that tells which triggers this object belongs to. (The bits should correspond to the savedTriggerNames array.)
This assumes that the same object can belong to different triggers (should check).

mva() - is there another MVA for electrons?
http://cmslxr.fnal.gov/lxr/source/DataFormats/EgammaCandidates/interface/GsfElectron.h#574

More possible variables are listed here:
http://cmslxr.fnal.gov/lxr/source/DataFormats/EgammaCandidates/interface/GsfElectron.h

Update: some variables, like R9, are also saved twice: once in IdMaps and once on their own. Will leave only one instance.
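
The bit-mask idea for trigger objects could be sketched as below. The function names are hypothetical, and the sketch assumes that an object can match several paths and that at most 64 trigger names are saved:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Builds a 64-bit mask in which bit i is set when savedTriggerNames[i] is one
// of the paths the trigger object matched. Names at index 64 or beyond cannot
// be encoded in the mask and are ignored.
std::uint64_t MakeTriggerMask(const std::vector<std::string>& savedTriggerNames,
                              const std::vector<std::string>& matchedPaths) {
    std::uint64_t mask = 0;
    for (std::size_t i = 0; i < savedTriggerNames.size() && i < 64; ++i) {
        for (const std::string& path : matchedPaths) {
            if (path == savedTriggerNames[i]) {
                mask |= (std::uint64_t{1} << i);
                break;
            }
        }
    }
    return mask;
}

// Checks whether the object belongs to the trigger at the given index.
bool BelongsToTrigger(std::uint64_t mask, std::size_t index) {
    return index < 64 && ((mask >> index) & 1u) != 0;
}
```

Storing one integer per object instead of a string would also reduce the ntuple size, at the cost of needing the savedTriggerNames array to decode it.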

Put SuperClusterFootprintRemoval variables into a struct

There is no reason to use a Map there, it should be a struct.