dirac-institute / classe

Classification Engine for Lightcurves

classe's Issues

Methods + Algorithms + Buzzwords

A running list of methods + algorithms to chase up and explore:

Hidden Markov Models: Widely used in speech recognition + reconstruction, though they might have been replaced by autoencoders by now (?). They might be useful for real-time classification tasks.
Autoencoders:
Active Learning: The idea is to learn with feedback. This might be especially useful for classification aimed at follow-up, where we can tell the classifier whether it gave us the right thing. The ML group at TU Dortmund is currently exploring these techniques for transient classification in neutrino experiments and VHE gamma-ray telescopes.
Generative Adversarial Networks: These seem to be the ML method of the moment. They have the advantage that they can work with unlabelled data, but they might be very hard to interpret (I don't know), so they are perhaps not ideal for population studies.
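
To make the active learning idea above concrete, here is a minimal sketch of one common query strategy, uncertainty sampling: ask for feedback (e.g. follow-up) on the objects whose predicted class probabilities have the highest entropy. The object ids, the probability values, and the function names are all made up for illustration; nothing here is taken from the classe code base.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_followup(predictions, n_queries):
    """Return the ids of the n_queries objects the classifier is least sure about.

    predictions: dict mapping a (hypothetical) object id to its class probabilities.
    """
    ranked = sorted(predictions, key=lambda obj: entropy(predictions[obj]), reverse=True)
    return ranked[:n_queries]


# Toy classifier output for four objects and three classes (illustrative numbers).
predictions = {
    "obj_a": [0.98, 0.01, 0.01],  # confident prediction: little to learn from a label
    "obj_b": [0.40, 0.35, 0.25],
    "obj_c": [0.70, 0.20, 0.10],
    "obj_d": [0.34, 0.33, 0.33],  # almost uniform: most informative to label
}

queries = select_for_followup(predictions, n_queries=2)
print(queries)
```

The labels obtained for the selected objects would then be added to the training set and the classifier retrained. Other query strategies exist, and which one suits follow-up best is an open question here.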

Possibly useful papers/reviews:

review of time series classification methods: https://link.springer.com/article/10.1007/s10618-016-0483-9
review of early time series classification methods (esp useful for early follow-up): https://pdfs.semanticscholar.org/8b71/25459b6d9e7fbaac71a64cc7110d45d217d2.pdf
deep learning on multivariate time series: https://link.springer.com/article/10.1007/s11704-015-4478-2
off-the-shelf feature extraction using RNNs: https://arxiv.org/abs/1706.08838

Classification Objectives

I can think of three main objectives the light curve classification engine might have:

classification of transients for follow-up
classification of transients to find outliers
classification of transients to do population studies

All three require slightly different approaches, I think. For example, for transient follow-up, one might want to minimize the false-positive rate on certain transients, but this is unlikely to be a good objective for a classifier that population studies are based on.
The solution might be to build a system that makes acceptable trade-offs, but that requires defining what those trade-offs are and what our minimum requirements on each objective are.
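
One way to start quantifying such trade-offs is to look at how purity (the fraction of selected objects that are real members of the class) and completeness (the fraction of real members that get selected) change with the score threshold. The sketch below uses made-up scores and labels, and the function name is hypothetical; it only illustrates that a threshold tuned for few false positives gives up completeness.

```python
def purity_and_completeness(scores, labels, threshold):
    """Purity and completeness when selecting all objects with score >= threshold.

    scores: classifier scores for the class of interest (higher = more likely).
    labels: 1 if the object truly belongs to the class, else 0.
    """
    selected = [lab for s, lab in zip(scores, labels) if s >= threshold]
    true_pos = sum(selected)
    total_pos = sum(labels)
    purity = true_pos / len(selected) if selected else float("nan")
    completeness = true_pos / total_pos if total_pos else float("nan")
    return purity, completeness


# Illustrative toy data: eight objects, four of which are true members.
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

for threshold in (0.85, 0.50, 0.25):
    purity, completeness = purity_and_completeness(scores, labels, threshold)
    print(f"threshold={threshold:.2f} purity={purity:.2f} completeness={completeness:.2f}")
```

A strict threshold might suit follow-up, where telescope time is wasted on false positives, while a population study might prefer a looser cut whose completeness is well characterized. Minimum requirements on each objective could then be stated as bounds on these two numbers.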

Are there other objectives I've not thought of?

Pitfalls

Things we need to be aware of or worry about:

training data set: Where do we get a training data set, and how similar is it to what we see in ZTF? Can we boost our results by doing transfer learning (learning on multiple existing data sets, each with a mapping from their feature space into the ZTF feature space)?
features: What are good features? How do we define/evaluate what a "good" feature is?
imbalanced class membership: There are likely transients that appear often, and others that are rare. This changes how we need to think about our classification objective, for example if the rare transients are more valuable to find.
uneven sampling of time series: Always trouble.
instrumental effects: Are there instrumental effects that might affect the shape of the time series?
symmetries: Kyle Cranmer + colleagues are exploring ML algorithms that respect existing symmetries in the problem. Might be worth thinking about that.

Will need to add more as we think of them.
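
As a small illustration of the class imbalance point above, one standard mitigation is to weight each class inversely to its frequency in the training set, so that rare transients are not drowned out by common ones. The sketch below computes such weights as n_samples / (n_classes * class_count), the same inverse-frequency heuristic scikit-learn uses for class_weight="balanced". The class names and counts are invented for the example, and whether this is the right choice for ZTF data is an open question.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count of that class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical training set: one common class and two rarer ones.
labels = ["RRab"] * 80 + ["RRc"] * 15 + ["SN"] * 5

weights = balanced_class_weights(labels)
for cls, weight in sorted(weights.items()):
    print(f"{cls}: {weight:.3f}")
```

These weights could be passed to a classifier's loss function or used to resample the training set. If rare classes are more valuable to find, the weights could be scaled further, which ties back to the classification objectives.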

RR-Lyr (ab and c) classification techniques

Generate an overview of papers that classify RR-Lyr from their light curves and the techniques they use

General Overview of Transient Classification Methods

Here's an initial (very broad, quite cursory) list of mostly very recent literature in this domain, as a starting point. This is not comprehensive, so additions are welcome:

My cursory reading is that there is a lot already out there on this topic, and it seems to me that perhaps the Berkeley group around Josh Bloom has done the most comprehensive development of a transient classification system so far.
