galaxy_zoo_capsule's Introduction

This project was developed for the Python 3.5 interpreter on a Linux machine. Using an Anaconda virtual environment is recommended.

To install dependencies, simply run:

pip install -r requirment.txt

This project uses TensorFlow, a machine learning library developed and maintained primarily by Google.

TensorFlow version 1.4.0 is required:

pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.4.0-cp35-cp35m-linux_x86_64.whl

Alternatively, install the GPU-optimized build:

pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.4.0-cp35-cp35m-linux_x86_64.whl

To install cv2 in Anaconda (optional):

conda install -c menpo opencv=2.4.11

Or via pip:

pip install opencv-python

Data set

The small Galaxy Zoo data set is available at https://www.kaggle.com/c/galaxy-zoo-the-galaxy-challenge/data. I downloaded and used only the training images and training labels.

To see how I select images that contain either elliptical or spiral galaxies, check out the Jupyter notebook in the data folder. Every image is resized to 212x212 pixels.

To see how the overlapping works, check out the Python script galaxy_data.py once you have the data set ready. Each overlapping image is made from two randomly selected galaxy images, offset from each other by at most 50 pixels on each axis relative to their centers. The label of each galaxy image is a one-hot vector with two elements indicating whether it is elliptical or spiral; the synthesized label is the logical OR of the two labels.
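The overlapping step described above can be sketched in a few lines of NumPy. This is only an illustration, not the code in galaxy_data.py: the function name `overlap_pair`, the zero-padded shift, and the use of a pixel-wise maximum to blend the two images are all assumptions made for the example.

```python
import numpy as np

def overlap_pair(img_a, img_b, label_a, label_b, max_offset=50, rng=None):
    """Hypothetical helper: blend two same-sized images and OR their labels."""
    rng = np.random.RandomState() if rng is None else rng
    # Draw an integer offset in [-max_offset, max_offset] for each axis.
    dy, dx = rng.randint(-max_offset, max_offset + 1, size=2)
    h, w = img_a.shape[:2]

    # Shift img_b by (dy, dx), filling the uncovered border with zeros.
    shifted = np.zeros_like(img_b)
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = img_b[src_y, src_x]

    # Assumption: blend with a pixel-wise maximum; the real script may differ.
    image = np.maximum(img_a, shifted)
    # The synthesized label is the logical OR of the two one-hot labels.
    label = np.logical_or(label_a, label_b).astype(label_a.dtype)
    return image, label
```

For example, combining an elliptical image labeled `[1, 0]` with a spiral image labeled `[0, 1]` yields the label `[1, 1]`, while combining two elliptical images keeps the label `[1, 0]`.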

Train & Test

Simply run:

python galaxy_main.py

Result

After 90,000 batch iterations (32 overlapping images per batch) with a 1e-4 learning rate, the model reaches an error rate of around 30% on the test data set. I retrained the model twice and got the same result.

Below is another learning curve, for a 1e-3 learning rate, which achieves the same performance in 10,000 batch iterations.
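For readers who want to reproduce an error rate of this kind, here is one plausible way to compute it for two-element multi-label outputs. The exact definition used in galaxy_main.py is not stated here, so the function name `multilabel_error_rate`, the 0.5 threshold, and the rule that an image counts as wrong when any label element is wrong are all assumptions.

```python
import numpy as np

def multilabel_error_rate(scores, labels, threshold=0.5):
    """Hypothetical metric: fraction of images with at least one wrong label."""
    # Assumption: a class is predicted present when its score reaches the threshold.
    preds = np.asarray(scores) >= threshold
    truth = np.asarray(labels).astype(bool)
    # An image is counted as an error if any of its label elements disagree.
    wrong = np.any(preds != truth, axis=1)
    return float(np.mean(wrong))
```

Sweeping `threshold` over a range of values is also a simple way to explore the threshold trade-off listed under TO DO.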

In comparison, https://github.com/yhyu13/tf_CapsNet, the project I mimicked, applies CapsNet to synthesized handwritten digit images and achieves a 10% error rate on its test data set. The lesson learned is that CapsNet is capable of recognizing elliptical and spiral galaxies when they overlap, but not as well as it recognizes handwritten digits. One challenge I found is that reconstruction is particularly hard for large input images; thus, this model is not trained with reconstruction error.
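Training without reconstruction error leaves only the classification term of the CapsNet objective. A minimal NumPy sketch of that margin loss follows, using the constants from the original CapsNet paper (m+ = 0.9, m- = 0.1, lambda = 0.5). Whether this project uses the same constants is an assumption, and the function name `margin_loss` is made up for the example.

```python
import numpy as np

def margin_loss(lengths, targets, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Margin loss on capsule output lengths; targets may be multi-hot."""
    lengths = np.asarray(lengths, dtype=float)
    targets = np.asarray(targets, dtype=float)
    # Penalize present classes whose capsule length falls below m_plus.
    present = targets * np.maximum(0.0, m_plus - lengths) ** 2
    # Penalize absent classes whose capsule length exceeds m_minus, down-weighted by lam.
    absent = lam * (1.0 - targets) * np.maximum(0.0, lengths - m_minus) ** 2
    # Sum over classes, then average over the batch.
    return float(np.mean(np.sum(present + absent, axis=1)))
```

Because each class has its own term, the same loss handles overlapping images whose label contains both elliptical and spiral.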

TO DO

  • Spiral, elliptical, and irregular galaxy classification (and more diverse synthesized images)
  • Find best threshold for FS/FN
  • Train with AlexNet
  • Change the method to generate training data. It should be: (i) spiral only (no synthesized image), (ii) elliptical only (no synthesized image), (iii) synthesized images, (iv) false negative examples

Contributors

yhyu13
