
splid-challenge

AI SSA Algorithm

Satellite Pattern-of-Life Identification Algorithm developed for the MIT ARCLab Prize for AI Innovation in Space 2024 Challenge.

Challenge | Challenge Paper | SPLID Dataset


Directory Overview

.
├── base/
│   ├── classifier.py        # Classifier code (prediction & evaluation)
│   ├── datahandler.py       # Data preprocessing, dataset generation
│   ├── evaluation.py        # Challenge-Evaluation
│   ├── localizer.py         # Localizer / Changepoint-Detection code
│   ├── prediction_models.py # Wrapper for creating and training ML models
│   ├── shap_analysis.py     # Code to perform SHAP feature analysis
│   └── utils.py             # Miscellaneous code
│
├── img/                        # Images used in this README
├── models/                     # Trained models and pre-fitted scalers
├── classifier_playground.ipynb # Jupyter notebook to train, evaluate and explore the classifier algorithm
├── classifier_sweeper.py       # Code to execute thorough parameter studies for the classifier
├── localizer_playground.ipynb  # Jupyter notebook to train, evaluate and explore the changepoint-detection algorithm
├── localizer_sweeper.py        # Code to execute thorough parameter studies for the localizer
├── data_analysis.ipynb         # Notebook to explore the datasets (unstructured)
├── model_analysis.ipynb        # Notebook to perform SHAP feature importance analysis
└── submission.py               # File to execute the full inference algorithm (changepoint localization, classification, postprocessing)

Installation

The repository was tested using Python 3.10 together with TensorFlow 2.15.0 on Ubuntu 22.04.

The dataset can be downloaded from here. Make sure to download the phase_2 dataset and extract it into the dataset/ directory.

The dataset directory should look like this:

dataset/
└── phase_2/
    ├── test/
    ├── training/
    ├── test_label.csv
    └── train_label.csv
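
Before running anything, it can help to confirm that the extracted files ended up where the tree above expects them. The following is a minimal sketch using only the standard library. The helper name `missing_dataset_entries` is hypothetical and not part of the repository; the expected entries are taken directly from the layout shown above.

```python
from pathlib import Path

# Entries expected under dataset/phase_2/, taken from the layout above.
EXPECTED_ENTRIES = ("test", "training", "test_label.csv", "train_label.csv")


def missing_dataset_entries(root="dataset"):
    """Return the expected phase_2 entries that are absent under `root`.

    Hypothetical helper for illustration; not part of the repository.
    """
    phase_dir = Path(root) / "phase_2"
    return [name for name in EXPECTED_ENTRIES if not (phase_dir / name).exists()]


if __name__ == "__main__":
    missing = missing_dataset_entries()
    if missing:
        print("Missing under dataset/phase_2:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

Run from the repository root, the script lists whichever entries are still missing, or reports that the layout is complete.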

A conda installation is recommended:

conda create -n ai_ssa python=3.10
conda activate ai_ssa
pip install -r requirements.txt

To recreate the challenge results:

python submission.py    # this will perform a full inference cycle, and should yield an F2 score of 0.805
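
For context on the reported number: an F2 score is the F-beta measure with beta = 2, which weights recall more heavily than precision. The sketch below shows only the generic formula, computed from true-positive, false-positive and false-negative counts. How the challenge matches predictions to ground truth to obtain those counts is handled in base/evaluation.py and is not reproduced here; the function name `f_beta` is illustrative.

```python
def f_beta(true_pos, false_pos, false_neg, beta=2.0):
    """Generic F-beta score from raw counts (beta=2 gives the F2 score)."""
    if true_pos == 0:
        return 0.0  # no correct detections: precision and recall are both 0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    beta_sq = beta ** 2
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
```

With beta = 2, missing a true event costs more than raising a false alarm: `f_beta(8, 8, 0)` (perfect recall, precision 0.5) scores higher than `f_beta(8, 0, 8)` (perfect precision, recall 0.5).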

Training and Evaluating Custom Models

Instructions for training and evaluating custom models can be found in the corresponding Jupyter notebooks. For larger parameter studies, the Weights & Biases parameter sweepers implemented in classifier_sweeper.py and localizer_sweeper.py are recommended. Both files are pre-filled with the parameters used for the submission models; the only difference is that legacy_diff_transform is disabled, since it corresponds to a bug that has since been fixed. Note that when newly trained models are used for inference in submission.py, the dataset generation parameters there must be changed to match the training configuration.
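
As a rough illustration of what a Weights & Biases sweep looks like, the sketch below builds a sweep configuration and hands it to the W&B agent. It is not taken from classifier_sweeper.py or localizer_sweeper.py: apart from legacy_diff_transform, every parameter name, value, the metric name and the project name is a placeholder, and the `train` callback is assumed to be supplied by the caller.

```python
def build_sweep_config():
    """Return an illustrative W&B sweep configuration (placeholder values)."""
    return {
        "method": "grid",  # alternatives include "random" and "bayes"
        "metric": {"name": "val_f2", "goal": "maximize"},  # placeholder metric name
        "parameters": {
            # Kept disabled, matching the note above about the fixed bug.
            "legacy_diff_transform": {"value": False},
            # Placeholder hyperparameters, not the submission settings.
            "learning_rate": {"values": [1e-3, 5e-4]},
            "batch_size": {"values": [64, 128]},
        },
    }


def run_sweep(train, project="splid-sweep-example", count=None):
    """Register the sweep and run `train` once per configuration.

    `train` is a caller-supplied function that calls wandb.init() and reads
    its hyperparameters from wandb.config.
    """
    import wandb  # imported lazily so the config can be inspected without W&B

    sweep_id = wandb.sweep(sweep=build_sweep_config(), project=project)
    wandb.agent(sweep_id, function=train, project=project, count=count)
```

Keeping the configuration in a plain dictionary makes it easy to inspect or version a parameter study before any runs are launched.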

Models that were re-trained with otherwise identical settings after the fix are also included in the models/ directory. For convenience, the FIXED_DIFF_TRANSFORM_MODELS setting was added at the top of submission.py, which makes it easy to compare the models before and after the fix.

Analyzing Trained Models

The model_analysis.ipynb notebook contains the code needed to run the SHAP GradientExplainer on the trained models. Model explainability is a complex topic in its own right, but the plots give a general idea of why the model behaves as it does. Note, however, that the importance the GradientExplainer assigns to certain features does not always match experimental results.
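
A common way to condense SHAP output into a per-feature ranking is to average the absolute attribution over samples and time steps. The sketch below does this in plain Python on a nested list indexed as [sample][time_step][feature]. That input shape and the function name are assumptions for illustration, and the notebook may summarize its results differently.

```python
def mean_abs_importance(shap_values, feature_names):
    """Rank features by mean absolute SHAP value.

    `shap_values` is a nested list indexed as [sample][time_step][feature].
    Returns (feature_name, importance) pairs, most important first.
    """
    totals = [0.0] * len(feature_names)
    n_rows = 0
    for sample in shap_values:
        for step in sample:
            for i, value in enumerate(step):
                totals[i] += abs(value)
            n_rows += 1
    if n_rows == 0:
        # No attributions supplied: every feature gets zero importance.
        return [(name, 0.0) for name in feature_names]
    scores = [(name, total / n_rows) for name, total in zip(feature_names, totals)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Taking absolute values first prevents positive and negative attributions from cancelling out, so a feature that pushes predictions strongly in both directions still ranks as important.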

Contributors

davidbaldsiefen
