
airPy

The airPy package was developed to extract high-resolution satellite data from Google Earth Engine and calculate Machine Learning-ready features for air pollution studies.

The code in this repository performs the following tasks:

1. Download satellite data from Google Earth Engine

  • Per specified latitude, longitude, and AOI buffer extent, data is downloaded from Google Earth Engine. The download job is fully specified by a user-generated configuration file, which defines the collection type, latitude/longitude points, analysis period, and buffer size.
  • Data can be saved as an xarray dataset covering the user-specified AOI extent, or as individual images per latitude/longitude point (a loading sketch follows this overview).

2. Generate Machine Learning-ready features
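
For the download step above, a collection run produces an xarray-compatible output. Assuming the result is saved as a NetCDF file (the path and filename below are hypothetical; the actual ones depend on your run configuration), it can be inspected like this:

import xarray as xr

# Hypothetical output path; the real filename is determined by the run configuration
ds = xr.open_dataset("runs/fire_australia_2020-01-01.nc")

print(ds)            # dimensions, coordinates, and data variables over the AOI
print(ds.data_vars)  # the extracted features per grid point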

airPy feature extraction

The diagram below depicts an example of the extraction process over the extent of Australia, matching the 11.1x11.1km spatial resolution of the MOMO-Chem (Multi-mOdel Multi-cOnstituent Chemical data assimilation) model output. Per latitude, longitude point on the MOMO-Chem grid:

  • Query the GEE dataset of interest
  • Create a ~55.5km radius buffer around the point of interest
  • Extract the data over the buffer AOI
  • Process features of interest (e.g. maximum population per grid point from the World Population dataset, percent of each land cover class from the MODIS dataset, etc.); a sketch of these steps with the Earth Engine Python API follows the diagram below

airPy AOI extraction process.
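
The snippet below is a minimal sketch of how these steps look with the Earth Engine Python API, using the GPWv411 population density dataset and a single illustrative grid point. It is not airPy's internal code; the point coordinates and feature choice are assumptions for illustration only.

import ee

ee.Initialize()

# One illustrative grid point over Australia (longitude, latitude)
point = ee.Geometry.Point([133.0, -25.0])

# ~55.5 km radius buffer around the point defines the AOI
aoi = point.buffer(55500)

# Query the GEE dataset of interest (here: GPWv411 population density)
pop = (ee.ImageCollection("CIESIN/GPWv411/GPW_Population_Density")
       .filterDate("2020-01-01", "2021-01-01")
       .first())

# Extract the data over the buffer AOI and compute a feature of interest:
# the maximum population density within the buffer
max_pop = pop.reduceRegion(
    reducer=ee.Reducer.max(),
    geometry=aoi,
    scale=927.67,  # native ~1 km (30 arc-second) resolution of GPWv411
).getInfo()

print(max_pop)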

Installation

Clone the repo:

git clone git@github.com:kelsdoerksen/airPy.git

Navigate to the airPy folder and install the package using pip.

pip install airpy

Install package dependencies using

pip install -r requirements.txt  

To use the Google Earth Engine API, you must create and authenticate a Google Earth Engine account. Information on the Earth Engine Python API can be found here.
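
Once the earthengine-api Python package is installed, a one-time authentication and per-session initialization can be done as follows (a minimal sketch; the exact flow depends on how your Earth Engine account and project are set up):

import ee

# Trigger the one-time OAuth flow and cache credentials locally
ee.Authenticate()

# Initialize the Earth Engine client for this session
ee.Initialize()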

Creating ML-ready features from GEE data with airPy

Running the airPy pipeline

The airPy pipeline job is fully specified by a configuration dictionary generated by the GenerateConfig class. To create a new configuration and run the pipeline, use the command

python run_airpy.py

and specify the various parameters of the data of interest. The available configurable parameters are:

  • --gee_data: The name of the GEE dataset of interest, one of modis, pop, fire, or nightlight.
    • modis: MCD12Q1.061 MODIS Land Cover Type Yearly Global 500m, available 2001-01-01 to 2021-01-01
    • pop: GPWv411: Population Density (Gridded Population of the World Version 4.11), available 2000-01-01 to 2020-01-01
    • fire: FireCCI51: MODIS Fire_cci Burned Area Pixel Product, Version 5.1, available 2001-01-01 to 2020-12-01
    • nightlight: VIIRS Nighttime Day/Night Band Composites Version 1, available 2012-04-01 to 2023-01-01
  • --region: Boundary region on Earth to extract data from, given as (latitudes), (longitudes). Must be one of:
    • globe: (-90, 90),(-180, 180)
    • europe: (35, 65),(-10, 25)
    • asia: (20, 50),(100, 145)
    • australia: (-50, -10),(130, 170)
    • north_america: (20, 55),(-125, -70)
    • west_europe: (25, 80),(-20, 10)
    • east_europe: (25, 80),(10, 40)
    • west_north_america: (10, 80),(-140, -95)
    • east_north_america: (10, 80), (-95, -50)
    • toar2: Locations of TOAR2 stations based on TOAR2 metadata
  • --date: Date of query. Must be in format 'YYYY-MM-DD'
  • --analysis_type: Type of analysis for data extraction. Must be one of collection, images.
  • --add_time: Specify if time component should be added to collection xarray. Useful for integrating into time series ML datasets. One of 'y' or 'n'.
  • --buffer_size: Specify region of interest (ROI) buffer extent. Units in metres.
  • --configs_dir: Specify the output directory for the config file.
  • --save_dir: Specify the run save directory.

Example:
python run_airpy.py --gee_data fire --region australia --date 2020-01-01 --analysis_type collection --buffer_size 55500 --configs_dir /configs --save_dir /runs

This generates a config file named config_australia_fire_2020-01-01_buffersize_55500_collection.json and kicks off the airPy job. For more information on the parameters, run the command python run_airpy.py --help.
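
As an illustration, the resulting configuration file can be loaded and inspected with Python's standard json module. The path below follows the example command above; the exact keys inside the file are defined by the GenerateConfig class, so the ones shown in the comment are only a guess at the general shape:

import json

# Path follows the --configs_dir and filename from the example above
with open("/configs/config_australia_fire_2020-01-01_buffersize_55500_collection.json") as f:
    cfg = json.load(f)

print(cfg)
# Expected to describe the job, roughly along the lines of:
# {"gee_data": "fire", "region": "australia", "date": "2020-01-01",
#  "analysis_type": "collection", "buffer_size": 55500, ...}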

Testing

Tests for each script are stored in the airpy/tests folder. pytest is used to test scripts in the airpy folder via the following command:

python3 -m pytest

Citing

If you found this code useful in your research, please cite it as follows: K. Doerksen, "airPy: Generating AI-ready datasets in python for air quality studies using Google Earth Engine", University of Oxford, NASA Jet Propulsion Lab, 2023.
