classe's People

Contributors

connolly

classe's Issues

Methods + Algorithms + Buzzwords

A running list of methods + algorithms to chase up and explore:

  • Hidden Markov Models: Used a lot in speech recognition + reconstruction, though they might have been replaced by autoencoders by now (?). They might be useful for real-time classification tasks.
  • Autoencoders:
  • Active Learning: The idea is to learn with feedback. This might be especially useful for classifications aimed at follow-up, where we can tell the classifier whether it gave us the right thing. The ML group at TU Dortmund is currently exploring these techniques for transient classification in neutrino experiments and VHE gamma-ray telescopes.
  • Generative Adversarial Networks: seem to be the ML method of the moment. They have the advantage that they can work with unlabelled data, but they might be super hard to interpret (I don't know), so perhaps not ideal for population studies.
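To make the active learning idea a bit more concrete, here is a minimal sketch of one common strategy, uncertainty sampling: ask for feedback on the objects the classifier is least sure about. This is only an illustration under my own assumptions, not necessarily what the TU Dortmund group does. The function names and the toy probabilities are hypothetical, and the classifier that would produce the probabilities is left out entirely.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(pool_probs, k):
    """Return indices of the k pool objects with the highest predictive entropy.

    pool_probs: one list of class probabilities per unlabelled object
    (hypothetical classifier output).
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy class probabilities for four unlabelled light curves (made-up numbers).
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # fairly uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # almost maximally uncertain
]

# Objects we would send for follow-up / human labelling first.
queries = select_queries(pool_probs, k=2)
print(queries)
```

In a real loop, the labels obtained for the selected objects would be added to the training set and the classifier retrained before selecting the next batch.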

Possibly useful papers/reviews:

Classification Objectives

I can think of three main objectives the light curve classification engine might have:

  • classification of transients for follow-up
  • classification of transients to find outliers
  • classification of transients to do population studies

All three require slightly different approaches, I think. For example, for transient follow-up, one might want to minimize the false-positive rate on certain transients, but this is unlikely to be a good objective for a classifier that population studies are based on.
The solution might be to build a system that makes acceptable trade-offs, but that requires defining what those trade-offs are and what our minimum requirements on each objective are.
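One way to make such a trade-off explicit is to fix a maximum acceptable false-positive rate and see how much completeness (true-positive rate) is left at the corresponding score threshold. The sketch below assumes a binary "is this the transient type we want" classifier that outputs a score per object. The function names, scores, and labels are all hypothetical and only meant to illustrate the bookkeeping.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (false_positive_rate, true_positive_rate) when every object
    with score >= threshold is flagged. Assumes both classes are present."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return fp / n_neg, tp / n_pos


def lowest_threshold_for_fpr(scores, labels, max_fpr):
    """Lowest observed score usable as a threshold with FPR <= max_fpr.

    The false-positive rate can only go down as the threshold goes up, so the
    first threshold that satisfies the cap also keeps the most true positives.
    Returns (threshold, fpr, tpr), or None if no threshold qualifies.
    """
    for t in sorted(set(scores)):
        fpr, tpr = rates_at_threshold(scores, labels, t)
        if fpr <= max_fpr:
            return t, fpr, tpr
    return None


# Made-up classifier scores and true labels (1 = target transient type).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

# Follow-up style requirement: tolerate at most 25% false positives.
print(lowest_threshold_for_fpr(scores, labels, max_fpr=0.25))
# Stricter requirement: no false positives at all, at the cost of completeness.
print(lowest_threshold_for_fpr(scores, labels, max_fpr=0.0))
```

The completeness lost when tightening the cap is exactly the kind of number a population study would care about, which is why the two objectives pull in different directions.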

Are there other objectives I've not thought of?

Pitfalls

Things we need to be aware of or worry about:

  • training data set: where do we get a training data set, and how similar/different is that training data set to things we see in ZTF? Can we boost our results by doing transfer learning (learning on multiple existing data sets, each with a mapping from their feature space into the ZTF feature space)?
  • features: What are good features? How do we define/evaluate what a "good" feature is?
  • imbalanced class membership: there are likely transients that appear often, and others that are rare. This changes how we need to think about our classification objective, for example if the rare transients are more valuable to find.
  • uneven sampling of time series: always trouble
  • instrumental effects: Are there instrumental effects that might affect the shape of the time series?
  • Symmetries: Kyle Cranmer + colleagues are exploring ML algorithms that respect existing symmetries in the problem. Might be worth thinking about that.
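For the imbalanced class membership point, one simple and widely used starting point is to weight each class inversely to its frequency, so that mistakes on rare transients cost more during training. The sketch below is just that heuristic, weight = N / (K * n_class), with made-up class names and counts. Whether it is the right choice for us depends on the objective we settle on.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency class weights: N / (K * n_c) for each class c,
    where N is the number of objects and K the number of classes."""
    counts = Counter(labels)
    n_total = len(labels)
    n_classes = len(counts)
    return {cls: n_total / (n_classes * n) for cls, n in counts.items()}


# Hypothetical training labels: one common and one rare transient class.
labels = ["common"] * 8 + ["rare"] * 2

weights = balanced_class_weights(labels)
print(weights)
```

These weights can then be passed to whatever loss function or classifier we end up using, provided it accepts per-class weights.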

Will need to add more as we think of them.

General Overview of Transient Classification Methods

Here's an initial (very broad, quite cursory) list of mostly very recent literature in this domain, as a starting point. This is not comprehensive, so additions are welcome:

General Transient Classification

Variable Stars

SNe

Related Fields

  • https://journals.aps.org/prd/abstract/10.1103/PhysRevD.95.104059

My cursory reading is that there is a lot already out there on this topic, and it seems to me that perhaps the Berkeley group around Josh Bloom has done the most comprehensive development of a transient classification system so far.
