galaxy-classification-yotta-project's Introduction

Galaxy Zoo

This project aims to classify the morphologies of distant galaxies using deep neural networks.

It is based on the Kaggle Galaxy Zoo Challenge.

Documentation

The project's assignment, as well as the papers that inspired the work, are available in doc/.

To get a better feel for the task the network has to learn, you can give it a go yourself! Try it here.

Installation

  1. (Optional) Install poetry if you don't have it already:
make setup-poetry
  2. Install dependencies:
poetry install
  3. To download the dataset, install Kaggle's API (you need to set up your credentials first), then download the dataset:
pip install --user kaggle
kaggle competitions download -c galaxy-zoo-the-galaxy-challenge
  4. You're good to go!

Train

Create the training labels for classification:

poetry run python -m gzoo.app.make_labels <data_dir>

required arguments:

  • <data_dir>: specifies the location of the dataset directory containing the original regression labels training_solutions_rev1.csv
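The rule that gzoo.app.make_labels uses to turn regression labels into classes is not described here. As a rough illustration only, the sketch below assumes the simplest possible rule: take the three answers to the first Galaxy Zoo question (columns Class1.1 to Class1.3 of training_solutions_rev1.csv) and keep the one with the highest vote fraction. The class names, the function names and the sample rows are all hypothetical.

```python
import csv
import io

# Hypothetical class names for the three answers to the first question.
CLASS1_NAMES = {
    "Class1.1": "smooth",
    "Class1.2": "featured_or_disk",
    "Class1.3": "star_or_artifact",
}


def to_class_label(row):
    """Return the name of the Class1.x answer with the highest vote fraction."""
    best_column = max(CLASS1_NAMES, key=lambda column: float(row[column]))
    return CLASS1_NAMES[best_column]


def make_labels(csv_text):
    """Map each GalaxyID to a single class name (assumed argmax rule)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["GalaxyID"]: to_class_label(row) for row in reader}


# Made-up rows in the same column layout as the regression labels file.
sample = (
    "GalaxyID,Class1.1,Class1.2,Class1.3\n"
    "1,0.2,0.7,0.1\n"
    "2,0.9,0.1,0.0\n"
)
labels = make_labels(sample)
```

The real script may use more questions, thresholds or a different set of classes; check the generated label file before relying on this mapping.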

Run the classification pipeline:

poetry run python -m gzoo.app.train -o config/train_classification.yaml

script option:

  • -o: specifies the .yaml config file to read options from. Every run option should be listed in this file (the default is config/train_classification.yaml), and every option in the yaml file can be overridden on the fly at the command line.

For instance, if you are happy with the values in the yaml config file and only want to change the number of epochs, you can either edit the config file or run directly:

poetry run python -m gzoo.app.train -o config/train_classification.yaml --epochs 50

This uses every value from config/train_classification.yaml except the number of epochs, which is set to 50.
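How the project merges command-line values into the yaml config is not shown here. The sketch below is one plausible way to do it, assuming the yaml file has already been loaded into a nested dictionary and that dotted option names such as model.pretrained address nested keys. The function name apply_overrides is hypothetical; the sample values are the defaults listed in this README.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of config with dotted-key overrides applied on top."""
    merged = copy.deepcopy(config)  # leave the file values untouched
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = merged
        for key in parents:
            # Walk down the nesting, creating sections that do not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = value
    return merged


# Values as they might come from the yaml file.
file_config = {
    "epochs": 90,
    "batch_size": 256,
    "model": {"arch": "resnet18", "pretrained": False},
}

# Values as they might come from "--epochs 50 --model.pretrained True".
run_config = apply_overrides(file_config, {"epochs": 50, "model.pretrained": True})
```

Only the overridden keys change; everything else keeps the value from the file.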

main run options:

  • --seed: seed for initializing training. (default: None)
  • --epochs: total number of epochs (default: 90)
  • --batch-size: batch size (default: 256)
  • --workers: number of threads (default: 4)
  • --model.arch: model architecture to be used (default: resnet18)
  • --model.pretrained: use pre-trained model (default: False)
  • --optimizer.lr: optimizer learning rate (default: 3.e-4 with Adam)
  • --optimizer.momentum: optimizer momentum (default: 0.9)
  • --optimizer.weight-decay: optimizer weights regularization (L2) (default: 1.e-4)

Predict

From the web app:

streamlit run gzoo/interface/web_app.py

From the command line:

poetry run python -m gzoo.app.predict -o config/predict.yaml

Configuration works the same way as for training; the default config is config/predict.yaml. The dataset directory specified in the config must contain an images_test_rev1 directory holding the images to predict, as well as the all_ones_benchmark.csv output template from the Kaggle project's data sources.
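Before launching a prediction run, it can help to check that the dataset directory has the expected layout. The helper below is a minimal sketch and is not part of the project; the name missing_predict_inputs is hypothetical, and it only checks for the two entries named above.

```python
import tempfile
from pathlib import Path


def missing_predict_inputs(dataset_dir):
    """List the entries required for prediction that are absent from dataset_dir."""
    root = Path(dataset_dir)
    missing = []
    if not (root / "images_test_rev1").is_dir():
        missing.append("images_test_rev1/")
    if not (root / "all_ones_benchmark.csv").is_file():
        missing.append("all_ones_benchmark.csv")
    return missing


# Demonstration on a throwaway directory: empty first, then populated.
with tempfile.TemporaryDirectory() as tmp:
    before = missing_predict_inputs(tmp)
    (Path(tmp) / "images_test_rev1").mkdir()
    (Path(tmp) / "all_ones_benchmark.csv").write_text("GalaxyID\n")
    after = missing_predict_inputs(tmp)
```

An empty list means both entries are present; it says nothing about whether the images or the template themselves are valid.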

A single-image example is provided, which you can run with:

poetry run python -m gzoo.app.predict -o config/predict.yaml --dataset example

Developer

Activate pre-commit hooks:

poetry run pre-commit install

Contributors

  • jeremie-koster
