
GridIn: tools for running over the grid

Please note:

  • This guide assumes that the framework is already installed. See https://github.com/cp3-llbb/Framework for detailed instructions
  • This guide also installs SAMADhi, our local database
  • You will probably want access to this database: ask around!
  • This guide also installs Datasets, our repository listing the datasets to be run on
  • The utilities in the scripts folder are copied to CMSSW/bin during scram b, so if these utilities have been modified you need to rebuild in order to have the changes in your PATH

First time setup

source /nfs/soft/grid/ui_sl6/setup/grid-env.sh
source /cvmfs/cms.cern.ch/cmsset_default.sh
source /cvmfs/cms.cern.ch/crab3/crab.sh

cd <path_to_CMSSW>
cmsenv

cd ${CMSSW_BASE}/src
git clone -o upstream [email protected]:cp3-llbb/GridIn.git cp3_llbb/GridIn
git clone -o upstream [email protected]:cp3-llbb/SAMADhi.git cp3_llbb/SAMADhi
git clone -o upstream [email protected]:cp3-llbb/Datasets.git cp3_llbb/Datasets

scram b -j 4
cd ${CMSSW_BASE}/src/cp3_llbb/GridIn
source first_setup.sh

How-to

The script you'll be working with is runOnGrid.py, from the scripts folder. During the first build, this script is copied by CMSSW into the global scripts directory, which is in your PATH; you can thus run it from anywhere in the source tree.

In order to run on the grid, you need 3 things:

  • First, an analyzer for the framework
  • A configuration file for this analyzer
  • A set of JSON files describing the datasets you want to run on

The first two points are up to you. For the last point, a set of JSON files for the commonly used datasets is already included (see inside test/datasets). The structure of these JSON files is described below.

You can now run on the grid. Go to the test folder, and run

runOnGrid.py -c <Your_Configuration_File> --mc datasets/mc_TT.json datasets/mc_DY.json <datasets/...>

Substitute <Your_Configuration_File> with the name of your configuration file, including the .py extension. You should now have a new file inside the working directory, named crab_TTJets_TuneCUETP8M1_amcatnloFXFX_25ns.py. This file is a configuration file for crab3; one is created automatically for each dataset specified when running runOnGrid.py.
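The naming of the generated file follows the pattern visible in the example above. A one-line sketch, inferred from that example rather than taken from the GridIn source:

```python
def crab_config_filename(pretty_name):
    """Name of the crab3 configuration file that runOnGrid.py writes for a
    dataset, following the pattern "crab_<name>.py" seen above (an inferred
    convention, not read from the GridIn code)."""
    return "crab_%s.py" % pretty_name

print(crab_config_filename("TTJets_TuneCUETP8M1_amcatnloFXFX_25ns"))
# crab_TTJets_TuneCUETP8M1_amcatnloFXFX_25ns.py
```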

Note: By default, runOnGrid.py does not submit any jobs to the grid; it only creates the necessary files for crab. If you want the jobs to be submitted automatically, add the --submit flag when running runOnGrid.py (this does not seem to work at the moment due to a crab bug).

To launch the jobs manually, use crab submit <crab_python_file>. All submitted tasks are stored inside the tasks folder.

Book-keeping

If the job has completed successfully, you can run

runPostCrab.py <myCrabConfigFile.py>

This will gather the needed information (number of events, code version, source dataset, ...) and insert the sample (and, if missing, the parent dataset) into the database.

JSON file format

Each dataset is stored inside a JSON file containing at least the dataset pretty name, its path, and the number of units per job. The meaning of a unit depends on the type of dataset: for data, a unit is a luminosity section; for MC, a unit is a file.

An example JSON file is given below:

{
  "/TTJets_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIISpring15DR74-Asympt25ns_MCRUN2_74_V9-v1/MINIAODSIM": {
    "name": "TTJets_TuneCUETP8M1_amcatnloFXFX_25ns",
    "units_per_job": 15
  }
}

A file can contain any number of datasets, but by convention only datasets belonging to the same group should go into the same file (for example, it's fine to have one file for the exclusive DY datasets, but not one file for all the different TT samples). The root node must be a dictionary, where each key is a dataset path and the values are:

  • name: The pretty name of the dataset. This name is used to format the task name and the output path
  • units_per_job: For MC, the number of files processed by each job; for data, the number of luminosity sections processed by each job.

For a data JSON file, an additional value is mandatory:

  • run_range: must be an array with two entries, like [1, 30], defining the range of validity of the dataset

An optional, but highly recommended, value is:

  • certified_lumi_file: the path (filename or URL) of the golden JSON file containing the certified luminosity sections. If not present, a default file is used, which will likely be outdated by the time you run.
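The format described above can be checked with a small validator. This is an illustrative sketch (validate_datasets is a hypothetical helper, not part of GridIn):

```python
# Minimal validator for the dataset JSON format described above.
# Illustrative sketch only; not part of GridIn itself.

def validate_datasets(datasets, is_data=False):
    """Check each entry against the expected schema; return the problems found."""
    problems = []
    for path, info in datasets.items():
        if not path.startswith("/"):
            problems.append("%s: dataset path should start with '/'" % path)
        if "name" not in info:
            problems.append("%s: missing 'name'" % path)
        if not isinstance(info.get("units_per_job"), int):
            problems.append("%s: 'units_per_job' must be an integer" % path)
        if is_data:
            rr = info.get("run_range")
            if not (isinstance(rr, list) and len(rr) == 2):
                problems.append("%s: data requires 'run_range' as [first, last]" % path)
    return problems

# The MC example from above passes the checks:
example = {
    "/TTJets_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIISpring15DR74-Asympt25ns_MCRUN2_74_V9-v1/MINIAODSIM": {
        "name": "TTJets_TuneCUETP8M1_amcatnloFXFX_25ns",
        "units_per_job": 15,
    }
}
print(validate_datasets(example))  # []
```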

gridin's Issues

Spring cleaning

As discussed in #101

For now, cleanOutdatedFrameworkProductions.py keeps all samples matching the list of tags stored in the master branch of data/SAMADhi_doNOTdelete_whitelist.json on GitHub.

It would be nice to refactor things in order to keep different tags for different samples (in this use case: MC was rerun but data was kept with an older tag, so we could delete the old MC production while keeping the whitelisted data tag).

Automatic analyzer configuration of data / mc

We like having only one config file containing a runOnData switch instead of maintaining two configs, one for data and one for MC. However, this is a bit cumbersome to edit depending on what we are launching on crab.

The fix would be to have runOnGrid.py detect runOnData in the configuration file and switch it on or off depending on the --mc or --data flag passed to it. This would also stop runPostCrab.py from complaining that the git repo is dirty, when technically it is a one-line change not intended to be committed/pushed.

(this feature is not necessarily wanted 'centrally', so by default it is assigned to an HH analyser I guess 😄)
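A minimal sketch of how such a switch could be toggled by text substitution (set_run_on_data is a hypothetical helper; the real fix would live in runOnGrid.py):

```python
import re

def set_run_on_data(config_text, run_on_data):
    """Rewrite the 'runOnData = True/False' assignment in the source text of
    an analyzer configuration. Hypothetical sketch of what runOnGrid.py could
    do when given the --data or --mc flag, leaving the file on disk untouched."""
    return re.sub(r"runOnData\s*=\s*(?:True|False)",
                  "runOnData = %s" % bool(run_on_data),
                  config_text)

print(set_run_on_data("runOnData = False", True))
# runOnData = True
```

Because the substitution happens on the text passed to crab rather than on the checked-out file, the git working tree stays clean.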

Update datasets to MiniAOD v2

  • data
  • DY
  • QCD
  • SM Higgs
  • Single top
  • TTbar
  • TTV
  • VV
  • WJets
  • ggX0HH
  • ggX2HH

(I am currently working on a first round, but not everything is here yet; at this pace it should all be coming this week, see these slides)

Crash when running runPostCrab

Hi,

I got this problem when running runPostCrab on /home/fynu/swertz/scratch/CMSSW_7_4_10/src/cp3_llbb/GridIn/test/crab_TT_TuneCUETP8M1_13TeV-powheg-pythia8_25ns.py.
Crab output is in /storage/data/cms/store/user/swertz/TT_TuneCUETP8M1_13TeV-powheg-pythia8/TT_TuneCUETP8M1_13TeV-powheg-pythia8_25ns/150923_133812/0000/.

##### Get information out of the crab config file (work area, dataset, pset)
done
##### Check if the dataset exists in the database
True
##### Get info from crab (logs, outputs, report)
done
##### Check if the job processed the whole sample
done
##### Figure out the code(s) version
FWUrl= https://github.com/swertz/Framework/tree/cb910ff
AnaUrl= https://github.com/swertz/TTAnalysis/tree/a6a226f
##### Figure out the number of selected events
Traceback (most recent call last):
  File "/nfs/scratch/fynu/swertz/CMSSW_7_4_10/bin/slc6_amd64_gcc491/runPostCrab.py", line 288, in <module>
    main()
  File "/nfs/scratch/fynu/swertz/CMSSW_7_4_10/bin/slc6_amd64_gcc491/runPostCrab.py", line 223, in main
    nselected  += int(line.split()[3])
ValueError: invalid literal for int() with base 10: 'method'

I ran the same code on another sample a couple of hours ago without problems...

I added something to the script to catch the exception and print the file name where it happens:
/storage/data/cms/store/user/swertz/TT_TuneCUETP8M1_13TeV-powheg-pythia8/TT_TuneCUETP8M1_13TeV-powheg-pythia8_25ns/150923_133812/0000/failed/log/cmsRun_41.log.tar.gz
As well as the line in the file causing the problem:
[3] Calling produce method for unscheduled module PATJetSelector/'selectedPatJetsAK4PFCHS'

I haven't checked out the code of runPostCrab yet... Why is it looking at the output of failed jobs? In this case a segfault happened and I had to resubmit it manually. It then finished successfully, see /storage/data/cms/store/user/swertz/TT_TuneCUETP8M1_13TeV-powheg-pythia8/TT_TuneCUETP8M1_13TeV-powheg-pythia8_25ns/150923_133812/0000/log/cmsRun_41.log.tar.gz
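A defensive version of the parsing step that failed above could skip non-numeric lines instead of crashing. This is a sketch only, not the actual runPostCrab.py code, which should also ignore logs from the failed/ directory:

```python
def count_selected_events(log_lines):
    """Sum the event counts found in the 4th field of log lines, skipping any
    line whose 4th field is not an integer (e.g. framework messages such as
    "[3] Calling produce method for ..."). Illustrative sketch; the real
    runPostCrab.py selects its lines differently."""
    total = 0
    for line in log_lines:
        fields = line.split()
        if len(fields) > 3:
            try:
                total += int(fields[3])
            except ValueError:
                continue  # not an event-count line
    return total
```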

runPostCrab: key error: 'lfn'

I just had this issue:

##### Check if the dataset exists in the database
calling das_import
****das query: [{u'dataset': [{u'name': u'/VVTo2L2Nu_13TeV_amcatnloFXFX_madspin_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM', u'datatype': u'mc', u'creation_time': u'2016-03-04 13:06:34', u'nevents': 2855237, u'tag': None, u'size': 66366967059}]}]
Dataset #None:
  name: /VVTo2L2Nu_13TeV_amcatnloFXFX_madspin_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM
  process: VVTo2L2Nu_13TeV_amcatnloFXFX_madspin_pythia8
  cross-section: 1.0
  number of events: 2855237
  size on disk: 66366967059
  CMSSW release: CMSSW_7_6_3
  global tag: 76X_mcRun2_asymptotic_v12
  type (data or mc): mc
  center-of-mass energy: 13.0 TeV
  creation time (on DAS): 2016-03-04 13:06:34
  comment: 
Insert into the database? [y]|n: y
done
done

##### Get info from crab (outputs, report)
done

##### Get information from the output files
Traceback (most recent call last):
  File "/nfs/scratch/fynu/swertz/CMSSW_7_6_3_patch2/bin/slc6_amd64_gcc493/runPostCrab.py", line 368, in <module>
    main() 
  File "/nfs/scratch/fynu/swertz/CMSSW_7_6_3_patch2/bin/slc6_amd64_gcc493/runPostCrab.py", line 278, in main
    for (i, lfn) in enumerate(output_files['lfn']):
KeyError: 'lfn'
##### Get information out of the crab config file (work area, dataset, pset)
done

The dataset is in the database: http://cp3.irmp.ucl.ac.be/~delaere/level2/SAMADhi/index.php?-action=view&-table=dataset&-cursor=14&-skip=0&-limit=30&-mode=find&-sort=creation_time+asc%2C+name+asc&-recordid=dataset%3Fdataset_id%3D270&-edit=1&cmssw_release=CMSSW_7_6_3
but there is no sample attached...

Out of 20 samples I've just added it's the only one which gave this error. There was nothing wrong with the crab tasks I've picked up...
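A possible defensive fix, sketched under the assumption that output_files is the dict returned by the crab client call (lfns_from_output is a hypothetical helper):

```python
def lfns_from_output(output_files):
    """Return (index, lfn) pairs from a crab output dict, tolerating the case
    where the 'lfn' key is absent, as in the KeyError reported above.
    Hypothetical sketch of a defensive fix, not the actual runPostCrab.py code."""
    lfns = output_files.get("lfn", [])
    if not lfns:
        print("Warning: crab returned no LFNs; the task may not be fully done")
    return list(enumerate(lfns))
```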

Move datasets to a standalone repository

Hi guys,

I've taken the liberty of creating a new repository to store our dataset files. It's not easy to do this via a PR, so I've already created it. Why?

We need to update the files to match the new Fall15 campaign, while keeping the old ones in case we need to reprocess them at some point. Branches are perfect for that, but the tools (runPostCrab, etc.) do not depend on a particular campaign, so that would mean doing backports / forward-ports all the time.

The new repository has two branches, Spring15 and Fall15, so you just need to switch branches to get the right datasets, without any change to GridIn.

I've already converted all the datasets to the Fall15 campaign, for both data and MC. Some datasets are still missing, so I've added a script to check whether the datasets in a given JSON exist. Current status is:

MC
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-550_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluHToWWTo2L2Nu_M125_13TeV_powheg_JHUgen_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-270_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/ST_tW_top_5f_inclusiveDecays_13TeV-powheg-pythia8_TuneCUETP8M1/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_HCALDebug_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-170to300_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/ST_tW_antitop_5f_inclusiveDecays_13TeV-powheg-pythia8_TuneCUETP8M1/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-350_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗ (status: PRODUCTION)
/TTZToQQ_TuneCUETP8M1_13TeV-amcatnlo-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/ST_s-channel_4f_leptonDecays_13TeV-amcatnlo-pythia8_TuneCUETP8M1/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗ (status: PRODUCTION)
/TTTo2L2Nu_13TeV-powheg/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/WZZ_TuneCUETP8M1_13TeV-amcatnlo-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-800_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-300_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-500_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/VBFHToWWTo2L2Nu_M125_13TeV_powheg_JHUgen_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-350_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-30to50_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/QCD_Pt-120to170_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/ST_t-channel_antitop_4f_leptonDecays_13TeV-powheg-pythia8_TuneCUETP8M1/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/TT_TuneCUETP8M1_13TeV-powheg-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/TTWJetsToLNu_TuneCUETP8M1_13TeV-amcatnloFXFX-madspin-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-300toInf_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-700_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-20to30_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/TTWJetsToQQ_TuneCUETP8M1_13TeV-amcatnloFXFX-madspin-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/TT_Mtt-700to1000_TuneCUETP8M1_13TeV-powheg-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext1-v1/MINIAODSIM ... ✓
/QCD_Pt-20toInf_MuEnrichedPt15_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-600_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/TT_Mtt-1000toInf_TuneCUETP8M1_13TeV-powheg-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-650_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/ST_t-channel_4f_leptonDecays_13TeV-amcatnlo-pythia8_TuneCUETP8M1/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-300_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/VVTo2L2Nu_13TeV_amcatnloFXFX_madspin_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-700_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-15to20_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-260_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-400_narrow_13TeV-madgraph/RunIISpring15MiniAODv2-74X_mcRun2_asymptotic_v2-v2/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-550_narrow_13TeV-madgraph/RunIISpring15MiniAODv2-74X_mcRun2_asymptotic_v2-v2/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-260_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-600_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-650_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-1000_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-50to80_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-800_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-400_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/HWminusJ_HToWW_M125_13TeV_powheg_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-900_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗ (status: PRODUCTION)
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-500_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-270_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗ (status: PRODUCTION)
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-900_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/HWplusJ_HToWW_M125_13TeV_powheg_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/QCD_Pt-80to120_EMEnriched_TuneCUETP8M1_13TeV_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗ (status: PRODUCTION)
/GluGluToRadionToHHTo2B2VTo2L2Nu_M-450_narrow_13TeV-madgraph/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/HZJ_HToWW_M125_13TeV_powheg_pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/TTJets_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12v1/MINIAODSIM ... ✗
/TTZToLLNuNu_M-10_TuneCUETP8M1_13TeV-amcatnlo-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/WetsToLNu_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✗
/GluGluToBulkGravitonToHHTo2B2VTo2L2Nu_M-450_narrow_13TeV-madgraph/RunIISpring15MiniAODv2-74X_mcRun2_asymptotic_v2-v2/MINIAODSIM ... ✓
/DYJetsToLL_M-10to50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
/ST_t-channel_top_4f_leptonDecays_13TeV-powheg-pythia8_TuneCUETP8M1/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM ... ✓
Data
/DoubleEG/Run2015C_25ns-16Dec2015-v1/MINIAOD ... ✓
/MuonEG/Run2015D-16Dec2015-v1/MINIAOD ... ✗ (status: PRODUCTION)
/DoubleMuon/Run2015D-16Dec2015-v1/MINIAOD ... ✓
/DoubleEG/Run2015D-16Dec2015-v1/MINIAOD ... ✗ (status: PRODUCTION)
/MuonEG/Run2015C_25ns-16Dec2015-v1/MINIAOD ... ✓
/DoubleMuon/Run2015C_25ns-16Dec2015-v1/MINIAOD ... ✓

You can see everything here: https://github.com/cp3-llbb/Datasets

If, in the end, we prefer to keep everything here, we can just delete the other repository.

/store cleaner script

The idea would be to remove directories from /storage/data/cms/store/user/username/ if and only if:

  • the task is some 'cp3_llbb' job (do not touch your favorite btag or jetmet or tracker crab3 jobs)
  • the task is older than ~ 1 month
  • the task is not inserted in SAMADhi

in order to remove all these aborted, incomplete, or buggy crab jobs...
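The three criteria above could be sketched like this (should_delete is hypothetical; the 'cp3_llbb' check and the SAMADhi lookup are naive placeholders for the real logic):

```python
import os
import time

def should_delete(task_dir, known_samples, max_age_days=30):
    """Decide whether a crab output directory can be cleaned, following the
    three criteria above. 'known_samples' stands in for a SAMADhi query and
    the substring test for a proper task-origin check; both are assumptions."""
    if "cp3_llbb" not in task_dir:           # only touch our own productions
        return False
    age_days = (time.time() - os.path.getmtime(task_dir)) / 86400.0
    if age_days < max_age_days:              # keep recent tasks
        return False
    return task_dir not in known_samples     # keep anything registered in SAMADhi
```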

--submit option will end up with crashes: Exit code 8009 - Configuration

For the record, the current --submit option of our runOnGrid.py ends up with jobs crashing, while they run fine locally.

The error is something like [1], and a complete log can be found at [2]. Digging in the HyperNews, it seems related to a bug in the crab client API.

Submitting the jobs individually (not using the multicrab-like client API features) seems to work fine.

I did not look much deeper; maybe there is a way around it.

[1]
----- Begin Fatal Exception 09-Jul-2015 22:38:03 KST-----------------------
An exception of category 'Configuration' occurred while
[0] Constructing the EventProcessor
[1] Constructing module: class=VersionedGsfElectronIdProducer label='egmGsfElectronIDs'
Exception Message:
Duplicate Process The process name ETM was previously used on these products.
Please modify the configuration file to use a distinct process name.
----- End Fatal Exception -------------------------------------------------

[2] /storage/data/cms/store/user/obondu/GluGluToRadionToHHTo2B2VTo2L2Nu_M-260_narrow_13TeV-madgraph/GluGluToRadionToHHTo2B2VTo2L2Nu_M-260_narrow_Asympt25ns/150709_131649/0000/failed/log/cmsRun_1.log.tar.gz

Some runPostCrab cleaning

It seems the specification of the crabCommand API has changed (or maybe they just improved the documentation 😄), and we no longer need the dirty workaround around getoutput and crab report to keep the result from being printed to the terminal.

Namely, from their example:

    # If you want crabCommand to be quiet:
    #from CRABClient.UserUtilities import setConsoleLogLevel
    #from CRABClient.ClientUtilities import LOGLEVEL_MUTE
    #setConsoleLogLevel(LOGLEVEL_MUTE)
    # With this function you can change the console log level at any time.

    # To retrieve the current crabCommand console log level:
    #from CRABClient.UserUtilities import getConsoleLogLevel
    #crabConsoleLogLevel = getConsoleLogLevel()

    # If you want to retrieve the CRAB loggers:
    #from CRABClient.UserUtilities import getLoggers
    #crabLoggers = getLoggers()

It would avoid the silent failures of runPostCrab.py that are in fact hiding crab crashes...

Remove date from dataset name

We should remove the date from the sample name and have the submit script append it automatically when needed.
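A sketch of the stripping step, assuming the date is an embedded YYYY-MM-DD stamp as in the sample names seen in the database (strip_date is a hypothetical helper):

```python
import re

def strip_date(sample_name):
    """Remove a '_YYYY-MM-DD' stamp from a sample name, so the submit script
    can append a fresh one when needed. Illustrative regex only."""
    return re.sub(r"_\d{4}-\d{2}-\d{2}", "", sample_name)

print(strip_date("DoubleMuon_Run2015D-PromptReco-v4_2015-10-20"))
# DoubleMuon_Run2015D-PromptReco-v4
```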

Using time stamp with RunPostCrab?

Dear all,

Shall we consider the possibility of adding a time stamp (in addition to the tag related to the code version) when running runPostCrab, to distinguish two productions made with the same code version? It seems that at the moment this is not possible (is it?).

EDIT:

Actually, I think there is a bug. If you run twice with the same version of the code, the database shows something strange. Example: a first run on Dec 8th, then a second one today (the 11th). As you can see, the first line of the entry is well updated, but the path is not: it still points to the production created on Dec 8th.

Sample #791 (created on 2015-12-11 21:58:43 by simon):
name: DoubleMuon_Run2015D-PromptReco-v4_2015-10-20_v1.1.0+7415-11-g330707b_ZAAnalysis_1694c06
path: /storage/data/cms/store/user/sdevissc/DoubleMuon/DoubleMuon_Run2015D-PromptReco-v4_2015-10-20/151208_213421/0000
