
GridIn: tools for running over the grid

Please note:

  • This guide assumes that the framework is already installed. See https://github.com/cp3-llbb/Framework for detailed instructions
  • This guide also installs SAMADhi, our local database
  • You will probably want access to this database: ask around!
  • This guide also installs Datasets, our repository listing the datasets to be run on
  • The utilities in the scripts folder are copied to CMSSW/bin during scram b, so if these utilities have been modified you need to rebuild in order to have the changes in your PATH

First time setup

source /nfs/soft/grid/ui_sl6/setup/grid-env.sh
source /cvmfs/cms.cern.ch/cmsset_default.sh
source /cvmfs/cms.cern.ch/crab3/crab.sh

cd <path_to_CMSSW>
cmsenv

cd ${CMSSW_BASE}/src
git clone -o upstream [email protected]:cp3-llbb/GridIn.git cp3_llbb/GridIn
git clone -o upstream [email protected]:cp3-llbb/SAMADhi.git cp3_llbb/SAMADhi
git clone -o upstream [email protected]:cp3-llbb/Datasets.git cp3_llbb/Datasets

scram b -j 4
cd ${CMSSW_BASE}/src/cp3_llbb/GridIn
source first_setup.sh

How-to

The script you'll be working with is runOnGrid.py, from the scripts folder. During the first build, this script is copied by CMSSW into the global scripts directory, which is in your PATH; you can thus run it from anywhere in the source tree.

In order to run on the grid, you need 3 things:

  • First, an analyzer for the framework
  • A configuration file for this analyzer
  • A set of JSON files describing the datasets you want to run on

The first two points are up to you. For the last point, a set of JSON files for the commonly used datasets is already included (see inside test/datasets). The structure of these JSON files is described below.

You can now run on the grid. Go to the test folder, and run

runOnGrid.py -c <Your_Configuration_File> --mc datasets/mc_TT.json datasets/mc_DY.json <datasets/...>

Substitute <Your_Configuration_File> with the name of your configuration file, including the .py extension. You should now have a new file inside the working directory, named crab_TTJets_TuneCUETP8M1_amcatnloFXFX_25ns.py. This file is a configuration file for crab3; one is created automatically for each dataset specified when running runOnGrid.py.
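The naming of the generated file follows the pattern visible in the example above. A one-line sketch, inferred from that example rather than taken from the GridIn source:

```python
def crab_config_filename(pretty_name):
    """Name of the crab3 configuration file that runOnGrid.py writes for a
    dataset, following the pattern "crab_<name>.py" seen above (an inferred
    convention, not read from the GridIn code)."""
    return "crab_%s.py" % pretty_name

print(crab_config_filename("TTJets_TuneCUETP8M1_amcatnloFXFX_25ns"))
# crab_TTJets_TuneCUETP8M1_amcatnloFXFX_25ns.py
```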

Note: By default, runOnGrid.py does not submit any jobs to the grid; it only creates the necessary files for crab. If you want the jobs to be submitted automatically, add the --submit flag when running runOnGrid.py (this does not seem to work at the moment due to a crab bug).

To launch the jobs manually, use crab submit <crab_python_file>. All submitted tasks are stored inside the tasks folder.

Book-keeping

If the job has completed successfully, you can run

runPostCrab.py <myCrabConfigFile.py>

This will gather the needed information (number of events, code version, source dataset, ...) and insert the sample (and, if missing, the parent dataset) into the database.

JSON file format

Each dataset is stored inside a JSON file containing at least the dataset pretty name, its path, and the number of units per job. The meaning of a unit depends on the type of dataset: for data, a unit is a luminosity section; for MC, a unit is a file.

An example JSON file is given below:

{
  "/TTJets_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIISpring15DR74-Asympt25ns_MCRUN2_74_V9-v1/MINIAODSIM": {
    "name": "TTJets_TuneCUETP8M1_amcatnloFXFX_25ns",
    "units_per_job": 15
  }
}

A file can contain any number of datasets, but by convention only datasets belonging to the same group should go into the same file (for example, it's fine to have one file for the exclusive DY datasets, but not one file for all the different TT samples). The root node must be a dictionary, where each key is a dataset path and the values are:

  • name: The pretty name of the dataset. This name is used to format the task name and the output path
  • units_per_job: For MC, the number of files processed by each job; for data, the number of luminosity sections processed by each job.

For a data JSON file, an additional value is mandatory:

  • run_range: must be an array with two entries, like [1, 30], defining the range of validity of the dataset

An optional, but highly recommended, value is:

  • certified_lumi_file: the path (filename or URL) of the golden JSON file containing the certified luminosity sections. If not present, a default file is used, which will likely be outdated by the time you run.
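The format described above can be checked with a small validator. This is an illustrative sketch (validate_datasets is a hypothetical helper, not part of GridIn):

```python
# Minimal validator for the dataset JSON format described above.
# Illustrative sketch only; not part of GridIn itself.

def validate_datasets(datasets, is_data=False):
    """Check each entry against the expected schema; return the problems found."""
    problems = []
    for path, info in datasets.items():
        if not path.startswith("/"):
            problems.append("%s: dataset path should start with '/'" % path)
        if "name" not in info:
            problems.append("%s: missing 'name'" % path)
        if not isinstance(info.get("units_per_job"), int):
            problems.append("%s: 'units_per_job' must be an integer" % path)
        if is_data:
            rr = info.get("run_range")
            if not (isinstance(rr, list) and len(rr) == 2):
                problems.append("%s: data requires 'run_range' as [first, last]" % path)
    return problems

# The MC example from above passes the checks:
example = {
    "/TTJets_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIISpring15DR74-Asympt25ns_MCRUN2_74_V9-v1/MINIAODSIM": {
        "name": "TTJets_TuneCUETP8M1_amcatnloFXFX_25ns",
        "units_per_job": 15,
    }
}
print(validate_datasets(example))  # []
```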

gridin's Issues

Spring cleaning

As discussed in #101

For now, cleanOutdatedFrameworkProductions.py keeps all samples matching the list of tags stored in the master branch of data/SAMADhi_doNOTdelete_whitelist.json on GitHub.

It would be nice to refactor things in order to keep different tags for different samples (in this use case: MC was rerun but data was kept with an older tag, so we could delete the old MC production while keeping the whitelisted data tag).

Automatic analyzer configuration of data / mc

We like having only one config file containing a runOnData switch instead of maintaining two configs, one for data and one for MC. However, this is a bit cumbersome to edit depending on what we are launching on crab.

The fix would be to have runOnGrid.py detect runOnData in the configuration file and switch it on or off depending on the --mc or --data flag passed to it. This would also stop runPostCrab.py from complaining that the git repo is dirty, when technically it is a one-line change not intended to be committed/pushed.

(this feature is not necessarily wanted 'centrally', so by default it is assigned to an HH analyser I guess 😄)
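A minimal sketch of how such a switch could be toggled by text substitution (set_run_on_data is a hypothetical helper; the real fix would live in runOnGrid.py):

```python
import re

def set_run_on_data(config_text, run_on_data):
    """Rewrite the 'runOnData = True/False' assignment in the source text of
    an analyzer configuration. Hypothetical sketch of what runOnGrid.py could
    do when given the --data or --mc flag, leaving the file on disk untouched."""
    return re.sub(r"runOnData\s*=\s*(?:True|False)",
                  "runOnData = %s" % bool(run_on_data),
                  config_text)

print(set_run_on_data("runOnData = False", True))
# runOnData = True
```

Because the substitution happens on the text passed to crab rather than on the checked-out file, the git working tree stays clean.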

Update datasets to MiniAOD v2

  • data
  • DY
  • QCD
  • SM Higgs
  • Single top
  • TTbar
  • TTV
  • VV
  • WJets
  • ggX0HH
  • ggX2HH

(I am currently working on a first round, but not everything is here yet; at this pace it should all be coming this week, see these slides)

Crash when running runPostCrab

Hi,

I got this problem when running runPostCrab on /home/fynu/swertz/scratch/CMSSW_7_4_10/src/cp3_llbb/GridIn/test/crab_TT_TuneCUETP8M1_13TeV-powheg-pythia8_25ns.py.
Crab output is in /storage/data/cms/store/user/swertz/TT_TuneCUETP8M1_13TeV-powheg-pythia8/TT_TuneCUETP8M1_13TeV-powheg-pythia8_25ns/150923_133812/0000/.

##### Get information out of the crab config file (work area, dataset, pset)
done
##### Check if the dataset exists in the database
True
##### Get info from crab (logs, outputs, report)
done
##### Check if the job processed the whole sample
done
##### Figure out the code(s) version
FWUrl= https://github.com/swertz/Framework/tree/cb910ff
AnaUrl= https://github.com/swertz/TTAnalysis/tree/a6a226f
##### Figure out the number of selected events
Traceback (most recent call last):
  File "/nfs/scratch/fynu/swertz/CMSSW_7_4_10/bin/slc6_amd64_gcc491/runPostCrab.py", line 288, in <module>
    main()
  File "/nfs/scratch/fynu/swertz/CMSSW_7_4_10/bin/slc6_amd64_gcc491/runPostCrab.py", line 223, in main
    nselected  += int(line.split()[3])
ValueError: invalid literal for int() with base 10: 'method'

I ran the same code on another sample a couple of hours ago without problems...

I added something to the script to catch the exception and print the file name where it happens:
/storage/data/cms/store/user/swertz/TT_TuneCUETP8M1_13TeV-powheg-pythia8/TT_TuneCUETP8M1_13TeV-powheg-pythia8_25ns/150923_133812/0000/failed/log/cmsRun_41.log.tar.gz
As well as the line in the file causing the problem:
[3] Calling produce method for unscheduled module PATJetSelector/'selectedPatJetsAK4PFCHS'

I haven't checked out the code of runPostCrab yet... Why is it looking at the output of failed jobs? In this case a segfault happened and I had to resubmit it manually. It then finished successfully, see /storage/data/cms/store/user/swertz/TT_TuneCUETP8M1_13TeV-powheg-pythia8/TT_TuneCUETP8M1_13TeV-powheg-pythia8_25ns/150923_133812/0000/log/cmsRun_41.log.tar.gz
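A defensive version of the parsing step that failed above could skip non-numeric lines instead of crashing. This is a sketch only, not the actual runPostCrab.py code, which should also ignore logs from the failed/ directory:

```python
def count_selected_events(log_lines):
    """Sum the event counts found in the 4th field of log lines, skipping any
    line whose 4th field is not an integer (e.g. framework messages such as
    "[3] Calling produce method for ..."). Illustrative sketch; the real
    runPostCrab.py selects its lines differently."""
    total = 0
    for line in log_lines:
        fields = line.split()
        if len(fields) > 3:
            try:
                total += int(fields[3])
            except ValueError:
                continue  # not an event-count line
    return total
```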

runPostCrab: key error: 'lfn'

I just had this issue:

##### Check if the dataset exists in the database
calling das_import
****das query: [{u'dataset': [{u'name': u'/VVTo2L2Nu_13TeV_amcatnloFXFX_madspin_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM', u'datatype': u'mc', u'creation_time': u'2016-03-04 13:06:34', u'nevents': 2855237, u'tag': None, u'size': 66366967059}]}]
Dataset #None:
  name: /VVTo2L2Nu_13TeV_amcatnloFXFX_madspin_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM
  process: VVTo2L2Nu_13TeV_amcatnloFXFX_madspin_pythia8
  cross-section: 1.0
  number of events: 2855237
  size on disk: 66366967059
  CMSSW release: CMSSW_7_6_3
  global tag: 76X_mcRun2_asymptotic_v12
  type (data or mc): mc
  center-of-mass energy: 13.0 TeV
  creation time (on DAS): 2016-03-04 13:06:34
  comment: 
Insert into the database? [y]|n: y
done
done

##### Get info from crab (outputs, report)
done

##### Get information from the output files
Traceback (most recent call last):
  File "/nfs/scratch/fynu/swertz/CMSSW_7_6_3_patch2/bin/slc6_amd64_gcc493/runPostCrab.py", line 368, in <module>
    main() 
  File "/nfs/scratch/fynu/swertz/CMSSW_7_6_3_patch2/bin/slc6_amd64_gcc493/runPostCrab.py", line 278, in main
    for (i, lfn) in enumerate(output_files['lfn']):
KeyError: 'lfn'
##### Get information out of the crab config file (work area, dataset, pset)
done

The dataset is in the database: http://cp3.irmp.ucl.ac.be/~delaere/level2/SAMADhi/index.php?-action=view&-table=dataset&-cursor=14&-skip=0&-limit=30&-mode=find&-sort=creation_time+asc%2C+name+asc&-recordid=dataset%3Fdataset_id%3D270&-edit=1&cmssw_release=CMSSW_7_6_3
but there is no sample attached...

Out of 20 samples I've just added it's the only one which gave this error. There was nothing wrong with the crab tasks I've picked up...
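A possible defensive fix, sketched under the assumption that output_files is the dict returned by the crab client call (lfns_from_output is a hypothetical helper):

```python
def lfns_from_output(output_files):
    """Return (index, lfn) pairs from a crab output dict, tolerating the case
    where the 'lfn' key is absent, as in the KeyError reported above.
    Hypothetical sketch of a defensive fix, not the actual runPostCrab.py code."""
    lfns = output_files.get("lfn", [])
    if not lfns:
        print("Warning: crab returned no LFNs; the task may not be fully done")
    return list(enumerate(lfns))
```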

Move datasets to a standalone repository

Hi guys,

I've taken the liberty of creating a new repository to store our dataset files. It's not easy to do this via a PR, so I've already created it. Why?

We need to update the files to match the new Fall15 campaign, while keeping the old ones in case we need to reprocess them at some point. Branches are perfect for that, but the tools (runPostCrab, etc.) do not depend on a particular campaign, so that would mean doing backports / forward-ports all the time.

The new repository has two branches, Spring15 and Fall15, so you just need to switch branches to get the right datasets, without any change to GridIn.

I've already converted all the datasets to the Fall15 campaign, for both data and MC. Some datasets are still missing, so I've added a script to check whether the datasets in a given JSON exist. Current status is:

MC
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-550_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluHToWWTo2L2Nu_M125_13TeV_powheg_JHUgen_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-270_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/ST_tW_top_5f_inclusiveDecays_13TeV-powheg-pythia8_TuneCUETP8M1/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_HCALDebug_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-170to300_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/ST_tW_antitop_5f_inclusiveDecays_13TeV-powheg-pythia8_TuneCUETP8M1/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-350_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗ (status: PRODUCTION)
/TTZToQQ_TuneCUETP8M1_13TeV-amcatnlo-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/ST_s-channel_4f_leptonDecays_13TeV-amcatnlo-pythia8_TuneCUETP8M1/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗ (status: PRODUCTION)
/TTTo2L2Nu_13TeV-powheg/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/WZZ_TuneCUETP8M1_13TeV-amcatnlo-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-800_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-300_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-500_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/VBFHToWWTo2L2Nu_M125_13TeV_powheg_JHUgen_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-350_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-30to50_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/QCD_Pt-120to170_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/ST_t-channel_antitop_4f_leptonDecays_13TeV-powheg-pythia8_TuneCUETP8M1/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/TT_TuneCUETP8M1_13TeV-powheg-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/TTWJetsToLNu_TuneCUETP8M1_13TeV-amcatnloFXFX-madspin-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-300toInf_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-700_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-20to30_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/TTWJetsToQQ_TuneCUETP8M1_13TeV-amcatnloFXFX-madspin-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/TT_Mtt-700to1000_TuneCUETP8M1_13TeV-powheg-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext1-v1/MINIAODSIM ... ✓
/QCD_Pt-20toInf_MuEnrichedPt15_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-600_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/TT_Mtt-1000toInf_TuneCUETP8M1_13TeV-powheg-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-650_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/ST_t-channel_4f_leptonDecays_13TeV-amcatnlo-pythia8_TuneCUETP8M1/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-300_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/VVTo2L2Nu_13TeV_amcatnloFXFX_madspin_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-700_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-15to20_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-260_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-400_narrow_13TeV-madgraph/RunIISpring15MiniAODv2-74X_mcRun2_asymptotic_v2-v2/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-550_narrow_13TeV-madgraph/RunIISpring15MiniAODv2-74X_mcRun2_asymptotic_v2-v2/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-260_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-600_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-650_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-1000_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-50to80_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-800_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-400_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/HWminusJ_HToWW_M125_13TeV_powheg_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-900_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗ (status: PRODUCTION)
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-500_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-270_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗ (status: PRODUCTION)
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-900_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/HWplusJ_HToWW_M125_13TeV_powheg_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-80to120_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗ (status: PRODUCTION)
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-450_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/HZJ_HToWW_M125_13TeV_powheg_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/TTJets_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12v1/MINIAODSIM ... ✗
/TTZToLLNuNu_M-10_TuneCUETP8M1_13TeV-amcatnlo-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/WetsToLNu_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-450_narrow_13TeV-madgraph/RunIISpring15MiniAODv2-74X_mcRun2_asymptotic_v2-v2/MINIAODSIM ... ✓
/DYJetsToLL_M-10to50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/ST_t-channel_top_4f_leptonDecays_13TeV-powheg-pythia8_TuneCUETP8M1/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
Data
/DoubleEG/Run2015C_25ns-16Dec2015-v1/MINIAOD ... ✓
/MuonEG/Run2015D-16Dec2015-v1/MINIAOD ... ✗ (status: PRODUCTION)
/DoubleMuon/Run2015D-16Dec2015-v1/MINIAOD ... ✓
/DoubleEG/Run2015D-16Dec2015-v1/MINIAOD ... ✗ (status: PRODUCTION)
/MuonEG/Run2015C_25ns-16Dec2015-v1/MINIAOD ... ✓
/DoubleMuon/Run2015C_25ns-16Dec2015-v1/MINIAOD ... ✓

You can see everything here: https://github.com/cp3-llbb/Datasets

If, in the end, we prefer to keep everything here, we can just delete the other repository.

/store cleaner script

The idea would be to remove directories from /storage/data/cms/store/user/username/ if and only if:

  • the task is some 'cp3_llbb' job (do not touch your favorite btag or jetmet or tracker crab3 jobs)
  • the task is older than ~ 1 month
  • the task is not inserted in SAMADhi

in order to remove all these aborted, incomplete, or buggy crab jobs...
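The three criteria above could be sketched like this (should_delete is hypothetical; the 'cp3_llbb' check and the SAMADhi lookup are naive placeholders for the real logic):

```python
import os
import time

def should_delete(task_dir, known_samples, max_age_days=30):
    """Decide whether a crab output directory can be cleaned, following the
    three criteria above. 'known_samples' stands in for a SAMADhi query and
    the substring test for a proper task-origin check; both are assumptions."""
    if "cp3_llbb" not in task_dir:           # only touch our own productions
        return False
    age_days = (time.time() - os.path.getmtime(task_dir)) / 86400.0
    if age_days < max_age_days:              # keep recent tasks
        return False
    return task_dir not in known_samples     # keep anything registered in SAMADhi
```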

--submit option will end up with crashes: Exit code 8009 - Configuration

For the record, the current --submit option of our runOnGrid.py ends up with jobs crashing, while they run fine locally.

The error is something like [1], and a complete log can be found at [2]. Digging in the HyperNews, it seems related to a bug in the crab client API.

Submitting the jobs individually (not using the multicrab-like client API features) seems to work fine.

I did not look much deeper; maybe there is a way around it.

[1]
----- Begin Fatal Exception 09-Jul-2015 22:38:03 KST-----------------------
An exception of category 'Configuration' occurred while
[0] Constructing the EventProcessor
[1] Constructing module: class=VersionedGsfElectronIdProducer label='egmGsfElectronIDs'
Exception Message:
Duplicate Process The process name ETM was previously used on these products.
Please modify the configuration file to use a distinct process name.
----- End Fatal Exception -------------------------------------------------

[2] /storage/data/cms/store/user/obondu/GluGluToRadionToHHTo2B2VTo2L2Nu_M-260_narrow_13TeV-madgraph/GluGluToRadionToHHTo2B2VTo2L2Nu_M-260_narrow_Asympt25ns/150709_131649/0000/failed/log/cmsRun_1.log.tar.gz

Some runPostCrab cleaning

It seems the specification of the crabCommand API has changed (or maybe they just improved the documentation 😄), and we no longer need the dirty workaround around getoutput and crab report to keep the result from being printed to the terminal.

Namely, from their example:

    # If you want crabCommand to be quiet:
    #from CRABClient.UserUtilities import setConsoleLogLevel
    #from CRABClient.ClientUtilities import LOGLEVEL_MUTE
    #setConsoleLogLevel(LOGLEVEL_MUTE)
    # With this function you can change the console log level at any time.

    # To retrieve the current crabCommand console log level:
    #from CRABClient.UserUtilities import getConsoleLogLevel
    #crabConsoleLogLevel = getConsoleLogLevel()

    # If you want to retrieve the CRAB loggers:
    #from CRABClient.UserUtilities import getLoggers
    #crabLoggers = getLoggers()

It would avoid the silent failures of runPostCrab.py that are in fact hiding crab crashes...

Remove date from dataset name

We should remove the date from the sample name and have the submit script append it automatically when needed.
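A sketch of the stripping step, assuming the date is an embedded YYYY-MM-DD stamp as in the sample names seen in the database (strip_date is a hypothetical helper):

```python
import re

def strip_date(sample_name):
    """Remove a '_YYYY-MM-DD' stamp from a sample name, so the submit script
    can append a fresh one when needed. Illustrative regex only."""
    return re.sub(r"_\d{4}-\d{2}-\d{2}", "", sample_name)

print(strip_date("DoubleMuon_Run2015D-PromptReco-v4_2015-10-20"))
# DoubleMuon_Run2015D-PromptReco-v4
```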

Using time stamp with RunPostCrab?

Dear all,

Shall we consider the possibility of adding a time stamp (in addition to the tag related to the code version) when running runPostCrab, to distinguish two productions made with the same code version? It seems that at the moment this is not possible (is it?).

EDIT:

Actually, I think there is a bug. If you run twice with the same version of the code, the database shows something strange. Example: a first run on Dec 8th, then a second one today (the 11th). As you can see, the first line of the entry is well updated, but the path is not: it still points to the production created on Dec 8th.

Sample #791 (created on 2015-12-11 21:58:43 by simon):
name: DoubleMuon_Run2015D-PromptReco-v4_2015-10-20_v1.1.0+7415-11-g330707b_ZAAnalysis_1694c06
path: /storage/data/cms/store/user/sdevissc/DoubleMuon/DoubleMuon_Run2015D-PromptReco-v4_2015-10-20/151208_213421/0000
