seti1's Introduction


Search for Extra Terrestrial Intelligence (SETI)

Project overview:

Each night, using the Allen Telescope Array (ATA) in northern California, the SETI Institute scans the sky at various radio frequencies, observing star systems with known exoplanets and searching for faint but persistent signals. The current signal detection system is programmed to search for only one particular kind of signal: narrow-band carrier waves. However, the detection system sometimes triggers (with unknown efficiency) on signals that are neither narrow-band signals nor explicitly known radio frequency interference (RFI). Several categories of such events appear to have been observed in the past.

Our goal is to classify these events accurately in real time. This may allow the signal detection system to make better observational decisions, increase the efficiency of the nightly scans, and allow for explicit detection of these other signal types.

For more information, refer to the SETI hackathon page.

When you’ve completed this pattern, you will understand how to:

  • Convert signal data into image data
  • Build and train a convolutional neural network
  • Display and share results in Jupyter Notebooks

This pattern will assist application developers who need to efficiently build powerful deep learning applications and use GPUs to train the model quickly.
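
As a rough illustration of the first objective, converting signal data into image data, the sketch below turns a one-dimensional complex time series into a two-dimensional log-power spectrogram with NumPy. It is not the code used by the notebooks: the function name `signal_to_spectrogram`, the frame count, and the synthetic tone-plus-noise input are all assumptions made for the example.

```python
import numpy as np

def signal_to_spectrogram(samples, n_rows):
    """Turn a 1-D complex time series into a 2-D log-power spectrogram."""
    samples = np.asarray(samples, dtype=np.complex64)
    usable = (samples.size // n_rows) * n_rows  # drop a ragged tail, if any
    frames = samples[:usable].reshape(n_rows, -1)  # one row per time slice
    # FFT each time slice; fftshift moves zero frequency to the middle column.
    spectrum = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    power = np.abs(spectrum) ** 2
    return np.log(power + 1e-12)  # small offset avoids log(0)

# Synthetic stand-in for a recording (hypothetical data): a weak tone in noise.
rng = np.random.default_rng(0)
t = np.arange(32 * 128)
tone = 0.5 * np.exp(2j * np.pi * 0.1 * t)
noise = rng.normal(size=t.size) + 1j * rng.normal(size=t.size)

image = signal_to_spectrogram(tone + noise, n_rows=32)
print(image.shape)  # (32, 128)
```

Each row of `image` is one time slice and each column one frequency bin, so a constant-frequency tone shows up as a bright vertical line that an image classifier can pick up.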

Notebooks:

This repository includes 3 parts:

1. Preparing dataset

  • Converting images to binary files using Numpy (SETI_img_to-binary.ipynb)
    • In this notebook we read the Basic 4 dataset and convert signals into a binary file. Also, we split data into train/test datasets.
  • Optional: Converting images to binary files using Spark (SETI_img_to_binary_spark.ipynb)
    • In this notebook we read the Basic 4 dataset through Spark and convert the signals into a binary file. This notebook is optional; you don't have to run it if you have already converted the images to binary files.
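
The preparation step described above can be sketched as follows: shuffle the images, split them into train and test sets, and write each part as a raw binary file with NumPy. This is a minimal sketch, not the notebook's code; the function name `split_and_save`, the file names, the `uint8` layout, the 80/20 split, and the tiny random images are all assumptions.

```python
import os
import tempfile
import numpy as np

def split_and_save(images, labels, out_dir, test_fraction=0.2, seed=0):
    """Shuffle, split into train/test, and write each part as raw binary."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_test = int(len(images) * test_fraction)
    parts = {"test": order[:n_test], "train": order[n_test:]}
    for name, idx in parts.items():
        # Pixels and labels are stored as uint8, back to back, with no header.
        images[idx].astype(np.uint8).tofile(os.path.join(out_dir, f"{name}_images.bin"))
        labels[idx].astype(np.uint8).tofile(os.path.join(out_dir, f"{name}_labels.bin"))
    return {name: len(idx) for name, idx in parts.items()}

# Hypothetical stand-in for decoded images: 10 grayscale 4x6 images, 4 classes.
rng = np.random.default_rng(1)
images = rng.integers(0, 256, size=(10, 4, 6))
labels = rng.integers(0, 4, size=10)

out_dir = tempfile.mkdtemp()
counts = split_and_save(images, labels, out_dir)

# Raw binary stores no shape, so the reader has to reshape explicitly.
train = np.fromfile(os.path.join(out_dir, "train_images.bin"), dtype=np.uint8).reshape(-1, 4, 6)
print(counts, train.shape)
```

Because `tofile` writes no metadata, the image dimensions and dtype have to be known when the file is read back, which is why the reader calls `reshape` with the original image size.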

2. Classification

  • Classification of images using a CNN on a single GPU (SETI_CNN_Tf_SingleGpu.ipynb)
    • In this notebook, we use the famous SETI dataset to build a convolutional neural network (CNN) capable of classifying signals. The CNN reports, with some associated error, which type of signal the input is.
    • In our case, since we are running this notebook on IBM PowerAI, you have access to multiple GPUs, but for the sake of simplicity this notebook uses only one of them.
  • Optional: Classification of images using a CNN on multiple GPUs (SETI_CNN_Tf_MultiGpu.ipynb)
    • This notebook builds the same kind of convolutional neural network, but uses multiple GPUs. You will use IBM PowerAI with multiple GPUs to train the model in parallel.
    • Run this notebook only if you have access to an environment with multiple GPUs.
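
To make the building blocks of such a network concrete, here is a small NumPy sketch of one convolution, ReLU, and max-pooling stage applied to a toy image. It only illustrates what a single CNN stage computes; it is not the TensorFlow model from the notebooks, and the helper names, the 6x6 image, and the hand-picked kernel are assumptions.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products.

    Like most deep learning frameworks, this is cross-correlation: the kernel
    is not flipped.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Keep positive responses, zero out the rest."""
    return np.maximum(x, 0.0)

def max_pool_2x2(x):
    """Downsample by taking the maximum of each non-overlapping 2x2 block."""
    h, w = (x.shape[0] // 2) * 2, (x.shape[1] // 2) * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy image (hypothetical): a single bright vertical line in column 3.
image = np.zeros((6, 6))
image[:, 3] = 1.0

# Hand-picked kernel that responds to thin vertical lines.
kernel = np.array([[-1.0, 2.0, -1.0]] * 3)

features = max_pool_2x2(relu(conv2d_valid(image, kernel)))
print(features)
# [[0. 6.]
#  [0. 6.]]
```

In a trained CNN the kernel values are learned from data rather than chosen by hand, and many kernels are applied in parallel at each layer.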

3. Prediction

  • Use the trained model for prediction (SETI_prediction.ipynb)
    • In this notebook you can load a pre-trained model and predict the signal class.
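
The last step of prediction, turning the network's raw output scores into a class and a confidence, can be sketched with a softmax followed by an argmax. The class names below are placeholders rather than the dataset's real labels, and the logits are made-up numbers standing in for the output of a loaded model.

```python
import numpy as np

# Placeholder labels (assumption): substitute the real class names of the dataset.
CLASS_NAMES = ["class_0", "class_1", "class_2", "class_3"]

def softmax(logits):
    """Convert raw scores to probabilities, row by row, in a numerically stable way."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def predict(logits):
    """Return (class name, probability) for each row of logits."""
    probs = softmax(np.asarray(logits, dtype=float))
    best = np.argmax(probs, axis=-1)
    return [(CLASS_NAMES[i], float(p[i])) for i, p in zip(best, probs)]

# Made-up scores for two inputs, standing in for a trained model's output.
logits = [[2.0, 0.5, 0.1, -1.0], [0.0, 0.0, 3.0, 0.0]]
predictions = predict(logits)
for name, confidence in predictions:
    print(f"{name}: {confidence:.2f}")
```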

Performance

A convolutional neural network involves a large number of matrix and vector multiplications that can be parallelized. GPUs were designed to handle exactly this kind of matrix operation in parallel, so they can outperform CPUs on this workload.

Why do GPUs outperform CPUs?

A single CPU core processes a matrix operation serially, one element at a time. A single GPU, by contrast, can have hundreds or thousands of cores, while a CPU typically has far fewer, so the GPU can compute many elements of the result at the same time.
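
The reason matrix operations parallelize so well is that every element of the result can be computed independently of the others. The sketch below spells out a matrix product one element at a time, the way a single core would, and checks it against NumPy's built-in product. The function name `matmul_serial` and the small input matrices are assumptions for illustration.

```python
import numpy as np

def matmul_serial(a, b):
    """Matrix product computed one output element at a time."""
    n, k = a.shape
    _, m = b.shape
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            # out[i, j] depends only on row i of `a` and column j of `b`,
            # never on another output element, so all n * m of these dot
            # products could run at the same time on separate cores.
            out[i, j] = sum(a[i, p] * b[p, j] for p in range(k))
    return out

a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)

serial = matmul_serial(a, b)
print(np.allclose(serial, a @ b))  # True
```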

How to use GPU with TensorFlow?

Note that if both a CPU and a GPU are available on the machine running the notebook, and a TensorFlow operation has both CPU and GPU implementations, the GPU device is given priority when the operation is assigned to a device.
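
If you want to make the placement explicit rather than rely on the default priority, something along these lines should work. This sketch assumes the TensorFlow 2.x API (`tf.config.list_physical_devices` and `tf.device`), which may differ from the TensorFlow version the notebooks were written for; the helper `choose_device` is hypothetical, and the import is guarded so the snippet can still be read and run without TensorFlow installed.

```python
try:
    import tensorflow as tf
except ImportError:  # the helper below still works without TensorFlow
    tf = None

def choose_device(gpus):
    """Return the first GPU device string if any GPU is listed, else the CPU."""
    return "/GPU:0" if gpus else "/CPU:0"

if tf is not None:
    # Assumes TensorFlow 2.x; older releases expose a different API.
    gpus = tf.config.list_physical_devices("GPU")
    with tf.device(choose_device(gpus)):
        a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
        product = tf.matmul(a, a)
    print(product.device)  # shows where the operation actually ran
```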

Benchmark:

  • SETI_single_gpu_train.py achieves ~72% accuracy after 3k epochs of data (75K steps).
  • Speed: measured with batch_size 128.
  • Notice: The model is not tuned for maximum accuracy; you can achieve better results by tuning the parameters.

| CPU architecture | CPU cores | Memory | GPU | Step time (sec/batch) | Accuracy |
|---|---|---|---|---|---|
| POWER8 | 40 | 256 GB | 1 x Tesla K80 | ~0.127 | ~72% at 75K steps (3 hours) |
| POWER8 | 32 | 128 GB | 1 x Tesla P100 w/NVLink np8g4 | ~0.035 | ~72% at 75K steps (1 hour) |

  • SETI_multi_gpu_train.py achieves ~72% accuracy after 75K steps.
  • Speed: measured with batch_size 128.
  • Notice: The model is not tuned for maximum accuracy; you can achieve better results by tuning the parameters.

| CPU architecture | CPU cores | Memory | GPU | Step time (sec/batch) | Accuracy |
|---|---|---|---|---|---|
| POWER8 | 160 | 1 TB | 4 x Tesla K80 | ~0.066 | ~72% at 75K steps (83 minutes) |
| POWER8 | 64 | 256 GB | 2 x Tesla P100 w/NVLink np8g4 | ~0.033 | ~72% at 75K steps (40 minutes) |
| POWER8 | 128 | 512 GB | 4 x Tesla P100 w/NVLink np8g4 | ~0.017 | ~72% at 75K steps (20 minutes) |
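
The step times and total durations in the benchmark can be related with simple arithmetic: total time is roughly the number of steps multiplied by the time per step. The sketch below does this for the reported figures. The function name and the dictionary layout are assumptions, and the results are approximations that ignore any overhead outside the training steps.

```python
def training_minutes(steps, seconds_per_step):
    """Estimated wall-clock training time in minutes."""
    return steps * seconds_per_step / 60.0

STEPS = 75_000
BATCH_SIZE = 128

# Step times (sec/batch) as reported in the benchmark.
step_times = {
    "1 x Tesla K80": 0.127,
    "1 x Tesla P100": 0.035,
    "4 x Tesla K80": 0.066,
    "2 x Tesla P100": 0.033,
    "4 x Tesla P100": 0.017,
}

estimates = {gpu: training_minutes(STEPS, sec) for gpu, sec in step_times.items()}
for gpu, minutes in estimates.items():
    print(f"{gpu}: ~{minutes:.0f} min")

# Samples processed over the whole run: steps x batch size.
samples_seen = STEPS * BATCH_SIZE
print(samples_seen)  # 9600000
```

Taken together with the "3k epochs" figure, 9,600,000 samples over 3,000 epochs implies about 3,200 training samples per epoch. The estimates land close to the reported durations, which appear to be rounded.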
