BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping

🦢 Dataset Released: Cross-View iNAT Birds 2021

This cross-view bird species dataset consists of paired ground-level bird images and satellite images, along with the meta-information associated with the iNaturalist-2021 dataset.

Satellite images along with meta-information - Link

iNaturalist Images - Link

CiNAT-Birds-2021

Computer Vision Tasks

  1. Fine-grained image classification
  2. Satellite-to-bird image retrieval
  3. Bird-to-satellite image retrieval
  4. Geolocalization of bird species

An example of task 3 is shown below:

Retrieval
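As a rough illustration of the retrieval tasks, the sketch below ranks a gallery of satellite embeddings by their similarity to one bird-image embedding. It assumes both views are already encoded as fixed-length vectors and that cosine similarity is the ranking score; the embedding values and the function names `cosine_similarity` and `rank_gallery` are hypothetical and are not taken from the repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return gallery indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, item) for item in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


# Toy embeddings (hypothetical values, not outputs of any BirdSAT model).
bird_embedding = [1.0, 0.0, 0.5]
satellite_embeddings = [
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.4],
    [0.5, 0.5, 0.5],
]

ranking = rank_gallery(bird_embedding, satellite_embeddings)
print(ranking)  # [1, 2, 0]
```

Swapping the roles of query and gallery gives the satellite-to-bird direction.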

๐Ÿ‘จโ€๐Ÿ’ป Getting Started

Setting up

  1. Clone this repository:
    git clone https://github.com/mvrl/BirdSAT.git
  2. Clone the Remote-Sensing-RVSA repository inside BirdSAT:
    cd BirdSAT
    git clone https://github.com/ViTAE-Transformer/Remote-Sensing-RVSA.git
  3. Append the CVMMAE code in utils_model/CVMMAE.py to the file Remote-Sensing-RVSA/MAEPretrain_SceneClassification/models_mae_vitae.py.
  4. Download the pretrained satellite image encoder from Link and place it inside the folder pretrained_models. If loading this model raises an error, set the option kernel=3 in the class MaskedAutoencoderViTAE in Remote-Sensing-RVSA/MAEPretrain_SceneClassification/models_mae_vitae.py.
  5. Download all datasets, unzip them, and place them inside the folder data.
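The append step is easy to repeat by accident when setup is re-run, which would duplicate the CVMMAE definitions in the target file. The sketch below shows one possible way to make the append idempotent by writing a marker comment first. The marker string and the helper name `append_once` are assumptions for illustration, not part of the repository, and the demo runs on throwaway placeholder files rather than on the real sources.

```python
import tempfile
from pathlib import Path

# Hypothetical marker used to detect an earlier append.
MARKER = "# --- appended from utils_model/CVMMAE.py ---"


def append_once(src: Path, dst: Path, marker: str = MARKER) -> bool:
    """Append the text of src to dst unless the marker is already in dst.

    Returns True if the append happened and False if it was skipped.
    """
    if marker in dst.read_text(encoding="utf-8"):
        return False
    addition = src.read_text(encoding="utf-8")
    with dst.open("a", encoding="utf-8") as handle:
        handle.write(f"\n\n{marker}\n{addition}")
    return True


# Demo on temporary files with placeholder contents so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "CVMMAE.py"
    dst = Path(tmp) / "models_mae_vitae.py"
    src.write_text("PLACEHOLDER_SOURCE = 1\n", encoding="utf-8")
    dst.write_text("PLACEHOLDER_TARGET = 2\n", encoding="utf-8")
    first = append_once(src, dst)   # appends
    second = append_once(src, dst)  # skipped, marker already present
    copies = dst.read_text(encoding="utf-8").count("PLACEHOLDER_SOURCE")

print(first, second, copies)  # True False 1
```

In a real checkout, the same helper would be called from the BirdSAT root with `Path("utils_model/CVMMAE.py")` as the source and `Path("Remote-Sensing-RVSA/MAEPretrain_SceneClassification/models_mae_vitae.py")` as the target.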

Installing Required Packages

There are two ways to set up an environment that can run everything in the repository:

  1. Use the Dockerfile provided in the repository to build a Docker image with all required packages:
    docker build -t <your-docker-hub-id>/birdsat .
  2. Create a conda environment with all required packages:
    conda create -n birdsat python=3.10 && \
    conda activate birdsat && \
    pip install -r requirements.txt

Additionally, we host a pre-built Docker image on Docker Hub under the tag srikumar26/birdsat:latest.

🔥 Training Models

  1. Set all the parameters of interest in config.py before launching a training script.
  2. Run pre-training by calling:
    python pretrain.py
  3. Run fine-tuning by calling:
    python finetune.py
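For unattended runs, the two stages above can be chained so that a later stage is skipped when an earlier one fails. The following is a minimal sketch under that assumption; the README itself does not say whether fine-tuning depends on a successful pre-training run, and the helper `run_stages` is hypothetical.

```python
import subprocess
import sys


def run_stages(commands):
    """Run each command in order and stop at the first non-zero exit code.

    Returns the number of stages that completed successfully.
    """
    completed = 0
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            break
        completed += 1
    return completed


# The two scripts named in the steps above, run with the current interpreter.
STAGES = [
    [sys.executable, "pretrain.py"],
    [sys.executable, "finetune.py"],
]
```

Calling `run_stages(STAGES)` from the repository root, after editing config.py, would run pre-training and then fine-tuning, returning 2 only if both scripts exit with status 0.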

โ„๏ธ Pretrained Models

Download pretrained models from the given links below:

| Model Type | Download URL |
| --- | --- |
| CVE-MAE | Link |
| CVE-MAE-Meta | Link |
| CVM-MAE | Link |
| CVM-MAE-Meta | Link |

📑 Citation

@inproceedings{sastry2024birdsat,
  title={BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping},
  author={Sastry, Srikumar and Khanal, Subash and Dhakal, Aayush and Huang, Di and Jacobs, Nathan},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  year={2024}
}

๐Ÿ” Additional Links

Check out our lab website for other work on geospatial understanding and mapping:

  • Multi-Modal Vision Research Lab (MVRL) - Link
  • Related Works from MVRL - Link
