
leptonisolation's Introduction

Overview

This package is a lepton-isolation classification tool, to be used for analysis of particle collision data at CERN.

The tool takes leptons from collision events, feeds the surrounding tracks and calorimeter depositions into a neural net, and outputs a number between 0 and 1 indicating how likely the lepton is to be prompt, as opposed to coming from a heavy-flavor decay.

SamplePrep

Data production is the first step of the training pipeline. In this step, we take collision data in the form of DAOD or AOD ROOT files, perform event and object filtering, and produce trees with information on each lepton and its surrounding objects. Tracks and calorimeter clusters which were used in a lepton's own reconstruction are not included in its list of associated objects.
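As a rough illustration of the association step, the sketch below selects tracks inside a ΔR cone around a lepton and drops any track flagged as used in the lepton's own reconstruction. It is not the SamplePrep code: the dictionary layout, the `used_in_lepton_reco` flag, and the cone size of 0.4 are all assumptions made for this example.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the eta-phi plane, with delta-phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def associated_tracks(lepton, tracks, cone=0.4):
    """Return tracks inside the cone, skipping those used to build the lepton.

    The field names and the default cone size are hypothetical.
    """
    return [
        t for t in tracks
        if not t["used_in_lepton_reco"]
        and delta_r(lepton["eta"], lepton["phi"], t["eta"], t["phi"]) < cone
    ]

lepton = {"eta": 0.0, "phi": 3.1}
tracks = [
    {"eta": 0.1, "phi": -3.1, "used_in_lepton_reco": False},  # close, across the phi wrap
    {"eta": 0.0, "phi": 3.1, "used_in_lepton_reco": True},    # the lepton's own track
    {"eta": 2.0, "phi": 0.0, "used_in_lepton_reco": False},   # outside the cone
]
nearby = associated_tracks(lepton, tracks)
```

Here only the first track survives: the second is excluded because it belongs to the lepton itself, and the third lies outside the cone. Wrapping delta-phi matters because a lepton at phi = 3.1 and a track at phi = -3.1 are close neighbours, not 6.2 radians apart.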

Run this package on lxplus for access to the relevant libraries. Simply edit and source make_samples.sh in the SamplePrep/ folder in order to produce samples.

Training

Train a recurrent neural net to perform lepton isolation. Simply edit and run isolator.py.

The code in the Training/ folder can also perform hyperparameter tuning, plot production, model saving and loading, isolation variable sanity checks, and final analysis.

MiscScripts

These tools perform various diagnostic tests and checks.

Outputs

Training outputs are by default stored in /public/data/RNN/runs/, though this can be changed in the Python code.

How to display outputs when using SSH (e.g. on UIUC Skynet):

  • On your local computer, forward local port 16006 to TensorBoard port 6006 on Skynet: ssh -N -f -L localhost:16006:skynet:6006 @skynet
  • On Skynet, start TensorBoard: tensorboard --logdir /public/data/RNN/runs
  • Open http://localhost:16006 in a browser on your local computer.

leptonisolation's People

Contributors

mattunderscorezhang, particularlypythonicbs

leptonisolation's Issues

Loader efficiency

This is an awesome repository, thanks for sharing this code somewhere where everyone can benefit from it!

I have a few comments about the efficiency of this code:

  • You could be writing the HDF5 files such that the first index in the output dataset runs over lepton candidates rather than over events (and just keeps tracks within the dR cone). This might save you some sorting later on.

    • If you like having the first index correspond to event number, there's also nothing keeping you from writing out a 3D array. You can associate the tracks to jets in the WhateverWriter::write(...) method and make the track index 2D. That said, it would require rewriting all your variable filler functions, so I can understand if you don't want to go that way.
  • You do a lot of looping over tracks in loader.py. I think you'd get orders of magnitude speed up if rather than something like this:

    for track in tracks:
        if track['pT'] < 1000: continue
        if track['nSCTHits']+track['nPixHits'] < 7: continue
        good_tracks.append(track)

    you did something like this:

    # keep tracks that pass both cuts (the complement of the loop's skip conditions)
    ok_pt = tracks['pT'] >= 1000
    ok_hits = tracks['nSCTHits'] + tracks['nPixHits'] >= 7
    good_tracks = tracks[ok_pt & ok_hits]
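For anyone who wants to try the comparison, here is a small self-contained sketch. It assumes the tracks are held in a NumPy structured array with the field names used above; the sample values are made up, and the real loader.py may store its data differently. Note that the masks must select the tracks to keep, so they use the opposite comparisons from the `continue` conditions in the loop.

```python
import numpy as np

# Hypothetical stand-in for tracks read from the HDF5 file.
tracks = np.array(
    [(500.0, 4, 4), (1500.0, 2, 3), (2500.0, 5, 4), (3000.0, 8, 3)],
    dtype=[("pT", "f8"), ("nSCTHits", "i4"), ("nPixHits", "i4")],
)

# Loop version: skip tracks that fail either cut.
good_loop = []
for track in tracks:
    if track["pT"] < 1000:
        continue
    if track["nSCTHits"] + track["nPixHits"] < 7:
        continue
    good_loop.append(track)

# Vectorized version: build boolean masks for the tracks to keep,
# then index the array once with their element-wise AND.
ok_pt = tracks["pT"] >= 1000
ok_hits = tracks["nSCTHits"] + tracks["nPixHits"] >= 7
good_tracks = tracks[ok_pt & ok_hits]
```

Both versions keep the same two tracks here. The vectorized form does the comparisons inside NumPy instead of in the Python interpreter, which is where the speedup comes from on large arrays.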

Anyway, I recognize that efficiency is secondary to getting something that actually works, I just wanted to leave this comment in case you (or someone else who contributes) wants to speed things up a bit.
