

Rio-Tiler-PDS: A rio-tiler plugin for Public Datasets

rio-tiler-pds

A rio-tiler plugin to read from publicly-available datasets.


Important: this is the new home for rio-tiler's mission-specific readers (ref: cogeotiff/rio-tiler#195)


Documentation: https://cogeotiff.github.io/rio-tiler-pds/

Source Code: https://github.com/cogeotiff/rio-tiler-pds


Installation

You can install rio-tiler-pds using pip:

$ pip install -U pip
$ pip install rio-tiler-pds

or install from source:

$ pip install -U pip
$ pip install git+https://github.com/cogeotiff/rio-tiler-pds.git

Datasets

| Data | Level/Product | Format | Owner | Region | Bucket Type |
|---|---|---|---|---|---|
| Sentinel 2 | L1C | JPEG2000 | Sinergise / AWS | eu-central-1 | Requester-pays |
| Sentinel 2 | L2A | JPEG2000 | Sinergise / AWS | eu-central-1 | Requester-pays |
| Sentinel 2 | L2A | COG | Digital Earth Africa / AWS | us-west-2 | Public |
| Sentinel 1 | L1C GRD (IW, EW, S1-6) | COG (Internal GCPS) | Sinergise / AWS | eu-central-1 | Requester-pays |
| Landsat Collection 2 | L1, L2 | COG | USGS / AWS | us-west-2 | Requester-pays |
| CBERS 4/4A | L2/L4 | COG | AMS Kepler / AWS | us-east-1 | Requester-pays |
| MODIS (modis-pds) | MCD43A4, MOD09GQ, MYD09GQ, MOD09GA, MYD09GA | GTiff (External Overviews) | - | us-west-2 | Public |
| MODIS (astraea-opendata) | MCD43A4, MOD11A1, MOD13A1, MYD11A1, MYD13A1 | COG | Astraea / AWS | us-west-2 | Requester-pays |
| Copernicus Digital Elevation Model | GLO-30, GLO-90 | COG | Sinergise / AWS | eu-central-1 | Public |

Adding more datasets:

If you know of another publicly-available dataset that can easily be described with a "scene id", please feel free to open an issue.

Warnings

Requester-pays Buckets

On AWS, sentinel2, sentinel1, cbers and modis (in astraea-opendata) datasets are stored in requester pays buckets. This means that the cost of GET and LIST requests and egress fees for downloading files outside the AWS region will be charged to the accessing users, not the organization hosting the bucket. For rio-tiler and rio-tiler-pds to work with such buckets, you'll need to set AWS_REQUEST_PAYER="requester" in your shell environment.
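The same setting can also be applied from Python before a reader is opened; a minimal sketch (valid AWS credentials are still required, and the scene id is one of the examples used later in this README):

import os

# GDAL/rasterio (used under the hood) and rio-tiler-pds pick this up and
# send requester-pays requests to S3
os.environ["AWS_REQUEST_PAYER"] = "requester"

from rio_tiler_pds.sentinel.aws import S2JP2Reader

with S2JP2Reader("S2A_L1C_20170729_19UDP_0") as sentinel:
    print(sentinel.bounds)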

Partial reading of cloud-hosted datasets

When reading data, rio-tiler-pds performs partial reads when possible. Hence performance will be best on data stored as Cloud Optimized GeoTIFF (COG). It's important to note that Sentinel-2 scenes hosted on AWS are not in Cloud Optimized format but in JPEG2000. Partial reads from JPEG2000 files are inefficient, and GDAL (the library underlying rio-tiler-pds and rasterio) will need to make many GET requests and transfer a lot of data. This will be both slow and expensive, since AWS's JPEG2000 collection of Sentinel 2 data is stored in a requester pays bucket.

Ref: the "Do you really want people using your data" blog post.
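On top of that, standard GDAL configuration can cut down the number of requests made per read; a minimal sketch (not specific to rio-tiler-pds), reusing the tile example shown later in this README:

import rasterio
from rio_tiler_pds.sentinel.aws import S2COGReader

# Disabling directory listing avoids extra LIST requests when GDAL opens remote files
with rasterio.Env(GDAL_DISABLE_READDIR_ON_OPEN="EMPTY_DIR"):
    with S2COGReader("S2A_L2A_20170729_19UDP_0") as sentinel:
        img = sentinel.tile(78, 89, 8, bands=("B01", "B02"))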

Overview

Readers

Each dataset has its own submodule (e.g. Sentinel 2: rio_tiler_pds.sentinel.aws)

from rio_tiler_pds.landsat.aws import LandsatC2Reader
from rio_tiler_pds.sentinel.aws import S1L1CReader
from rio_tiler_pds.sentinel.aws import (
    S2JP2Reader,  # JPEG2000
    S2COGReader,   # COG
)

from rio_tiler_pds.cbers.aws import CBERSReader
from rio_tiler_pds.modis.aws import MODISPDSReader, MODISASTRAEAReader
from rio_tiler_pds.copernicus.aws import Dem30Reader, Dem90Reader

All readers are subclasses of rio_tiler.io.BaseReader and inherit its properties and methods; a short usage sketch follows the lists below.

Properties

  • bounds: Scene bounding box
  • crs: CRS of the bounding box
  • geographic_bounds: bounding box in geographic projection (e.g. WGS84)
  • minzoom: WebMercator MinZoom (e.g. 7 for Landsat 8)
  • maxzoom: WebMercator MaxZoom (e.g. 12 for Landsat 8)

Methods

  • info: Returns simple band metadata (e.g. nodata, band descriptions, ...)
  • statistics: Returns band statistics (percentiles, histogram, ...)
  • tile: Read a Web Mercator map tile from bands
  • part: Extract a part of bands for a given bounding box
  • preview: Return a low-resolution preview of bands
  • point: Return band pixel values for a given lon/lat
  • feature: Extract the part of bands defined by a GeoJSON feature

Other

  • bands (property): List of available bands for each dataset
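A minimal usage sketch exercising a few of these properties and methods (using the Landsat scene id from the example below; requester-pays access must be configured as described above):

from rio_tiler_pds.landsat.aws import LandsatC2Reader

with LandsatC2Reader("LC08_L2SP_001062_20201031_20201106_02_T2") as landsat:
    print(landsat.bands)                # available band names
    print(landsat.bounds, landsat.crs)  # scene bounding box and its CRS

    info = landsat.info(bands="SR_B2")  # simple metadata for one band

    # point() expects a lon/lat inside the scene; the scene centre is a safe choice
    lon = (landsat.geographic_bounds[0] + landsat.geographic_bounds[2]) / 2
    lat = (landsat.geographic_bounds[1] + landsat.geographic_bounds[3]) / 2
    values = landsat.point(lon, lat, bands=("SR_B4", "SR_B3", "SR_B2"))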

Scene ID

All readers take a scene id as their main input. The scene id is used internally by the reader to derive the full path of the data.

e.g. Landsat on AWS

Because the Landsat AWS PDS follows a regular schema to store the data (s3://{bucket}/c1/L8/{path}/{row}/{scene}/{scene}_{band}.TIF), we can easily reconstruct the full path of each band by parsing the scene id.

from rio_tiler_pds.landsat.aws import LandsatC2Reader
from rio_tiler_pds.landsat.utils import sceneid_parser

sceneid_parser("LC08_L2SP_001062_20201031_20201106_02_T2")

> {'sensor': 'C',
 'satellite': '08',
 'processingCorrectionLevel': 'L2SP',
 'path': '001',
 'row': '062',
 'acquisitionYear': '2020',
 'acquisitionMonth': '10',
 'acquisitionDay': '31',
 'processingYear': '2020',
 'processingMonth': '11',
 'processingDay': '06',
 'collectionNumber': '02',
 'collectionCategory': 'T2',
 'scene': 'LC08_L2SP_001062_20201031_20201106_02_T2',
 'date': '2020-10-31',
 '_processingLevelNum': '2',
 'category': 'standard',
 'sensor_name': 'oli-tirs',
 '_sensor_s3_prefix': 'oli-tirs',
 'bands': ('QA_PIXEL',
  'QA_RADSAT',
  'SR_B1',
  'SR_B2',
  'SR_B3',
  'SR_B4',
  'SR_B5',
  'SR_B6',
  'SR_B7',
  'SR_QA_AEROSOL',
  'ST_ATRAN',
  'ST_B10',
  'ST_CDIST',
  'ST_DRAD',
  'ST_EMIS',
  'ST_EMSD',
  'ST_QA',
  'ST_TRAD',
  'ST_URAD')}

with LandsatC2Reader("LC08_L2SP_001062_20201031_20201106_02_T2") as landsat:
    print(landsat._get_band_url("SR_B2"))

> s3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/001/062/LC08_L2SP_001062_20201031_20201106_02_T2/LC08_L2SP_001062_20201031_20201106_02_T2_SR_B2.TIF

Each dataset has a specific scene id format (a small validation sketch follows the list below):

!!! note "Scene ID formats"

- Landsat
    - link: [rio_tiler_pds.landsat.utils.sceneid_parser](https://github.com/cogeotiff/rio-tiler-pds/blob/e4421d3cf7c23b7b3552b8bb16ee5913a5483caf/rio_tiler_pds/landsat/utils.py#L35-L56)
    - regex: `^L[COTEM]0[0-9]_L\d{1}[A-Z]{2}_\d{6}_\d{8}_\d{8}_\d{2}_(T1|T2|RT)$`
    - example: `LC08_L1TP_016037_20170813_20170814_01_RT`

- Sentinel 1 L1C
    - link: [rio_tiler_pds.sentinel.utils.s1_sceneid_parser](https://github.com/cogeotiff/rio-tiler-pds/blob/e4421d3cf7c23b7b3552b8bb16ee5913a5483caf/rio_tiler_pds/sentinel/utils.py#L98-L121)
    - regex: `^S1[AB]_(IW|EW)_[A-Z]{3}[FHM]_[0-9][SA][A-Z]{2}_[0-9]{8}T[0-9]{6}_[0-9]{8}T[0-9]{6}_[0-9A-Z]{6}_[0-9A-Z]{6}_[0-9A-Z]{4}$`
    - example: `S1A_IW_GRDH_1SDV_20180716T004042_20180716T004107_022812_02792A_FD5B`

- Sentinel 2 JPEG2000 and Sentinel 2 COG
    - link: [rio_tiler_pds.sentinel.utils.s2_sceneid_parser](https://github.com/cogeotiff/rio-tiler-pds/blob/e4421d3cf7c23b7b3552b8bb16ee5913a5483caf/rio_tiler_pds/sentinel/utils.py#L25-L60)
    - regex: `^S2[AB]_[0-9]{2}[A-Z]{3}_[0-9]{8}_[0-9]_L[0-2][A-C]$` or `^S2[AB]_L[0-2][A-C]_[0-9]{8}_[0-9]{2}[A-Z]{3}_[0-9]$`
    - example: `S2A_29RKH_20200219_0_L2A`, `S2A_L1C_20170729_19UDP_0`, `S2A_L2A_20170729_19UDP_0`

- CBERS
    - link: [rio_tiler_pds.cbers.utils.sceneid_parser](https://github.com/cogeotiff/rio-tiler-pds/blob/e4421d3cf7c23b7b3552b8bb16ee5913a5483caf/rio_tiler_pds/cbers/utils.py#L28-L43)
    - regex: `^CBERS_(4|4A)_\w+_[0-9]{8}_[0-9]{3}_[0-9]{3}_L\w+$`
    - example: `CBERS_4_MUX_20171121_057_094_L2`, `CBERS_4_AWFI_20170420_146_129_L2`, `CBERS_4_PAN10M_20170427_161_109_L4`, `CBERS_4_PAN5M_20170425_153_114_L4`, `CBERS_4A_WPM_20200730_209_139_L4`

- MODIS (PDS and Astraea)
    - link: [rio_tiler_pds.modis.utils.sceneid_parser](https://github.com/cogeotiff/rio-tiler-pds/blob/c533d38330f46738c46cb9927dbe91b299dc643d/rio_tiler_pds/modis/utils.py#L29-L42)
    - regex: `^M[COY]D[0-9]{2}[A-Z0-9]{2}\.A[0-9]{4}[0-9]{3}\.h[0-9]{2}v[0-9]{2}\.[0-9]{3}\.[0-9]{13}$`
    - example: `MCD43A4.A2017006.h21v11.006.2017018074804`
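The regexes above can also be used directly to sanity-check a scene id before handing it to a reader; a small sketch with Python's re module, using the example ids from the list:

import re

LANDSAT_RE = r"^L[COTEM]0[0-9]_L\d{1}[A-Z]{2}_\d{6}_\d{8}_\d{8}_\d{2}_(T1|T2|RT)$"
CBERS_RE = r"^CBERS_(4|4A)_\w+_[0-9]{8}_[0-9]{3}_[0-9]{3}_L\w+$"

assert re.match(LANDSAT_RE, "LC08_L1TP_016037_20170813_20170814_01_RT")
assert re.match(CBERS_RE, "CBERS_4A_WPM_20200730_209_139_L4")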

Band Per Asset/File

rio-tiler-pds readers assume that bands (e.g. eo:bands in STAC) are stored in separate files.

$ aws s3 ls s3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/001/062/LC08_L2SP_001062_20201031_20201106_02_T2/ --request-payer
LC08_L2SP_001062_20201031_20201106_02_T2_ANG.txt
LC08_L2SP_001062_20201031_20201106_02_T2_MTL.json
LC08_L2SP_001062_20201031_20201106_02_T2_MTL.txt
LC08_L2SP_001062_20201031_20201106_02_T2_MTL.xml
LC08_L2SP_001062_20201031_20201106_02_T2_QA_PIXEL.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_QA_RADSAT.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_SR_B1.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_SR_B2.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_SR_B3.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_SR_B4.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_SR_B5.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_SR_B6.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_SR_B7.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_SR_QA_AEROSOL.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_SR_stac.json
LC08_L2SP_001062_20201031_20201106_02_T2_ST_ATRAN.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_ST_B10.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_ST_CDIST.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_ST_DRAD.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_ST_EMIS.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_ST_EMSD.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_ST_QA.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_ST_TRAD.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_ST_URAD.TIF
LC08_L2SP_001062_20201031_20201106_02_T2_ST_stac.json
LC08_L2SP_001062_20201031_20201106_02_T2_thumb_large.jpeg
LC08_L2SP_001062_20201031_20201106_02_T2_thumb_small.jpeg

When reading data or metadata, the readers fetch each band separately and merge the results.

e.g

with S2COGReader("S2A_L2A_20170729_19UDP_0") as sentinel:
    img = sentinel.tile(78, 89, 8, bands=("B01", "B02"))
    assert img.data.shape == (2, 256, 256)

    stats = sentinel.statistics(bands=("B01", "B02"))
    print(stats)
    >> {
      'B01': BandStatistics(
        min=2.0,
        max=17132.0,
        mean=2183.7570706659685,
        count=651247.0,
        sum=1422165241.0,
        std=3474.123975478363,
        median=370.0,
        majority=238.0,
        minority=2.0,
        unique=15112.0,
        histogram=[
          [476342.0, 35760.0, 27525.0, 24852.0, 24379.0, 23792.0, 20891.0, 13602.0, 3891.0, 213.0],
          [2.0, 1715.0, 3428.0, 5141.0, 6854.0, 8567.0, 10280.0, 11993.0, 13706.0, 15419.0, 17132.0]
        ],
        valid_percent=62.11,
        masked_pixels=397329.0,
        valid_pixels=651247.0,
        percentile_2=179.0,
        percentile_98=12465.0
      ),
      'B02': BandStatistics(
        min=1.0,
        max=15749.0,
        mean=1941.2052554560712,
        count=651247.0,
        sum=1264204099.0,
        std=3130.545395156859,
        median=329.0,
        majority=206.0,
        minority=11946.0,
        unique=13904.0,
        histogram=[
          [479174.0, 34919.0, 27649.0, 25126.0, 24913.0, 24119.0, 20223.0, 12097.0, 2872.0, 155.0],
          [1.0, 1575.8, 3150.6, 4725.4, 6300.2, 7875.0, 9449.8, 11024.6, 12599.4, 14174.199999999999, 15749.0]
        ],
        valid_percent=62.11,
        masked_pixels=397329.0,
        valid_pixels=651247.0,
        percentile_2=134.0,
        percentile_98=11227.079999999958
      )}

    print(stats["B01"].min)
    >> 2.0

Mosaic Reader: Copernicus DEM

The Copernicus DEM GLO-30 and GLO-90 readers are not per-scene readers but mosaic readers. This is possible because the dataset is global and each COG's file name encodes its geo-location, so we can easily construct a file path from a coordinate.

from rio_tiler_pds.copernicus.aws import Dem30Reader

with Dem30Reader() as dem:
    print(dem.assets_for_point(-57.2, -11.2))

>> ['s3://copernicus-dem-30m/Copernicus_DSM_COG_10_S12_00_W058_00_DEM/Copernicus_DSM_COG_10_S12_00_W058_00_DEM.tif']
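Since assets_for_point returns plain COG paths, the returned file can be opened with rio-tiler directly; a minimal sketch (the bucket is public, so no requester-pays setting is needed):

from rio_tiler.io import COGReader
from rio_tiler_pds.copernicus.aws import Dem30Reader

lon, lat = -57.2, -11.2

with Dem30Reader() as dem:
    asset = dem.assets_for_point(lon, lat)[0]

# The returned path is a regular COG, so any rio-tiler reader can open it
with COGReader(asset) as cog:
    print(cog.point(lon, lat))  # elevation value(s) at that coordinate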

Changes

See CHANGES.md.

Contribution & Development

See CONTRIBUTING.md

License

See LICENSE.txt

Authors

The rio-tiler project was begun at Mapbox and was transferred to the cogeotiff organization in January 2019.

See AUTHORS.txt for a listing of individual contributors.


rio-tiler-pds's Issues

Support for CBERS-4A

Hi,

I'm working on ingesting CBERS-4A images to AWS and extending rio-tiler-pds. I'll open a WIP PR in a few moments so we can discuss the subject, if desired, while it is being developed.

ASTER - publicly available dataset

The Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) instrument provides a publicly available dataset with 3 VNIR, 6 SWIR, and 5 thermal bands, accessible through both Google Earth Engine and AWS. Data collected from this platform are used extensively in the geosciences and environmental monitoring. I would like to request that this dataset be added to rio-tiler-pds.

AWS id = arn:aws:s3:::terrafusiondatasampler
GEE id = ASTER/AST_L1T_003

Web resources:
https://registry.opendata.aws/terrafusion/
https://developers.google.com/earth-engine/datasets/catalog/ASTER_AST_L1T_003#description

B10 Band unavailable.

Hi Vincent,

I'm battle-testing rio-tiler-pds and have run into some issues so far.

Mainly cloud-detection issues, as I could not get the "B10" band from L2A.

I'm also looking for the "CLD" band; do you know how to get that one?

B10 is available on AWS; however, rio-tiler-pds cannot retrieve it.

aws s3 ls s3://sentinel-s2-l1c/tiles/44/P/LU/2020/10/12/0/ --request-payer requester
2020-10-12 11:25:38 3672535 B01.jp2
2020-10-12 11:25:38 98640868 B02.jp2
2020-10-12 11:25:38 99913324 B03.jp2
2020-10-12 11:25:38 106900680 B04.jp2
2020-10-12 11:25:38 30355344 B05.jp2
2020-10-12 11:25:38 32663001 B06.jp2
2020-10-12 11:25:38 33809595 B07.jp2
2020-10-12 11:25:38 117991600 B08.jp2
2020-10-12 11:25:38 3420599 B09.jp2
2020-10-12 11:25:38 2114876 B10.jp2
2020-10-12 11:25:38 31471597 B11.jp2
2020-10-12 11:25:38 30893542 B12.jp2
2020-10-12 11:25:38 33728721 B8A.jp2
2020-10-12 11:25:38 135421286 TCI.jp2

Numpy concatenation error on S2COGReader.part

Hey all, I am trying to use S2COGReader to read part of a raster according to a bbox.
Here is my GeoJSON, courtesy of geojson.io:

geojson = {
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              35.013427734375,
              2.0869407308811065
            ],
            [
              37.46337890624999,
              2.0869407308811065
            ],
            [
              37.46337890624999,
              4.850154078505659
            ],
            [
              35.013427734375,
              4.850154078505659
            ],
            [
              35.013427734375,
              2.0869407308811065
            ]
          ]
        ]
      }
    }
  ]
}

I then use featureBounds to get the bbox and pass it to the reader with a band expression:

geom = geojson["features"][0]["geometry"]
with S2COGReader("S2A_29RKH_20200219_0_L2A") as sen:
    data = sen.part(bbox=featureBounds(geom), expression="(B03-B11)/(B03+B11)")

I get this error

ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 618 and the array at index 1 has size 555

I have tried different geometries and different ID numbers. Any help?
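Not a confirmed diagnosis, but two things seem worth checking: the bands are read independently, so mixed band resolutions or partial scene coverage can yield arrays of different shapes before the expression is applied, and forcing a common output size with width/height (or max_size) makes them stack; also, grid square 29RKH sits in UTM zone 29 (roughly 12°W–6°W), while the GeoJSON above is around 35–37°E, so the bbox may not overlap the scene at all. A hedged sketch of the first check:

from rasterio.features import bounds as featureBounds
from rio_tiler_pds.sentinel.aws import S2COGReader

geom = geojson["features"][0]["geometry"]

with S2COGReader("S2A_29RKH_20200219_0_L2A") as sen:
    # width/height resample every band to the same output shape
    data = sen.part(
        featureBounds(geom),
        expression="(B03-B11)/(B03+B11)",
        width=512,
        height=512,
    )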

Add a reader for the Sentinel 2 DigitalTwin dataset

Hack: to get the bbox, I'm using a small GeoJSON https://gist.github.com/vincentsarago/ba3a2c026b47d52fd8a9b65ba9e1696d which could be shipped with the rio-tiler-pds module 🤷‍♂️

import attr
from typing import Dict, Tuple, Type

from morecantile import TileMatrixSet
from rasterio.features import bounds as featureBounds
from rio_tiler import constants
from rio_tiler.errors import InvalidBandName
from rio_tiler.io import COGReader, MultiBandReader

# mgrs_grid: FeatureCollection loaded from the GeoJSON gist linked above (not shown),
# keyed by the MGRS Grid Zone Designator in properties["GZD"]

default_bands = (
    "B02",
    "B03",
    "B04",
    "B08",
    "B11",
    "B12",
)


def get_grid_bbox(name: str) -> Tuple[float, float, float, float]:
    """Return the bounding box of an MGRS grid zone from the mgrs_grid GeoJSON."""
    feat = list(filter(lambda x: x["properties"]["GZD"] == name, mgrs_grid["features"]))[0]
    return featureBounds(feat["geometry"])


@attr.s
class S2DigitalTwinReader(MultiBandReader):
    """Sentinel DigitalTwin Reader"""

    grid: str = attr.ib()
    year: int = attr.ib()
    month: int = attr.ib()
    day: int = attr.ib()
    reader: Type[COGReader] = attr.ib(default=COGReader)
    reader_options: Dict = attr.ib(factory=dict)
    tms: TileMatrixSet = attr.ib(default=constants.WEB_MERCATOR_TMS)
    minzoom: int = attr.ib(default=6)
    maxzoom: int = attr.ib(default=10)

    bands: tuple = attr.ib(init=False, default=default_bands)

    _scheme: str = "s3"
    _hostname: str = "sentinel-s2-l2a-mosaic-120"
    _prefix: str = "{year}/{month}/{day}/{grid}"

    def __attrs_post_init__(self):
        """Get the grid bounds."""
        self.bounds = get_grid_bbox(self.grid)

    def _get_band_url(self, band: str) -> str:
        """Validate band name and return the band's url."""
        band = band if len(band) == 3 else f"B0{band[-1]}"

        if band not in self.bands:
            raise InvalidBandName(f"{band} is not valid.\nValid bands: {self.bands}")

        prefix = self._prefix.format(year=self.year, month=self.month, day=self.day, grid=self.grid)
        return f"{self._scheme}://{self._hostname}/{prefix}/{band}.tif"


with S2DigitalTwinReader("57U", 2019, 1, 1) as s2Twin:
    print(s2Twin.bounds)

azimuth line and pixelcount of extracted tile/point

Hi,

Sentinel-1 data requires pre-processing (e.g. radiometric calibration, thermal noise reduction, speckle filtering), so we need the metadata files delivered in the product info. Since rio-tiler extracts only a part of the entire image, we need to know the azimuth line and pixel count of the extracted tile/point in order to derive the corresponding metadata.

Does rio-tiler provide these? If not, could you help me figure out how to get them in a convenient way? It would then be good to add this feature to the tiler.

thanks,

Carst

more info on:
https://sentinel.esa.int/documents/247904/685163/S1-Radiometric-Calibration-V1.0.pdf

Unable to run "Sentinel 2 - AWS COG" example

I'm trying to run the following snippet, as reported in the documentation

from rio_tiler_pds.sentinel.aws import S2COGReader

with S2COGReader("S2A_29RKH_20200219_0_L2A") as sentinel:
    print(type(sentinel))

I got the following error:

NoCredentialsError: Unable to locate credentials

In my understanding, sentinel2 COGs are accessible without any AWS credentials.

Moreover, global Sentinel 2 COGs were recently made publicly available. Is there a way to access them through rio_tiler_pds? If yes, the documentation still refers only to the Africa-related COGs.

Issue 1 - Update

As of 6c82b6e the code is just a pure copy of rio-tiler code.

I'd love to update each submodule to use a ContextManager like rio_tiler.io.COGReader and rio_tiler.io.STACReader.

TO DO

  • update code to use rio_tiler.io.cogeo.multi_* modules (replacement of rio_tiler.reader_multi)
  • decide and change the code architecture cogeotiff/rio-tiler#124 (comment)

The idea behind moving all the code here was to ease the addition of new PDS datasets (e.g. the next Landsat release from USGS, Sentinel 2 COG, ...), but I'm not sure what the best way to do that is.

Forward slash on Windows breaks the Landsat parser

Hi Vincent, great work.

I tried to install a local version on Windows; the Landsat tiler metadata/tile breaks because the parser builds the prefix with backslashes. Here (line 125):

meta_file = "http://{bucket}.s3.amazonaws.com/{prefix}/{scene}_MTL.txt".format(
    **scene_params
)

Parser results in:
http://landsat-pds.s3.amazonaws.com/c1\L8\197\024\LC08_L1TP_197024_20180703_20180717_01_T1/LC08_L1TP_197024_20180703_20180717_01_T1_MTL.txt

while it should be:
http://landsat-pds.s3.amazonaws.com/c1/L8/197/024/LC08_L1TP_197024_20180703_20180717_01_T1/LC08_L1TP_197024_20180703_20180717_01_T1_MTL.txt

Thanks
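For reference, a minimal sketch of the kind of fix (assuming the prefix is currently assembled with os.path.join, which uses backslashes on Windows): build object prefixes with posixpath.join or plain string formatting so the separators stay forward slashes on every platform.

import posixpath

# hypothetical helper: S3/HTTP prefixes must always use "/", never os.sep
def landsat_prefix(path: str, row: str, scene: str) -> str:
    return posixpath.join("c1", "L8", path, row, scene)

assert landsat_prefix("197", "024", "LC08_L1TP_197024_20180703_20180717_01_T1") == \
    "c1/L8/197/024/LC08_L1TP_197024_20180703_20180717_01_T1"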

Accessing additional Sentinel 2 products?

I am trying to access the SCL product from Sentinel 2, but it seems like it is not listed in the list of available bands:

with S2COGReader("S2A_28QED_20230514_0_L2A") as sentinel:
    img = sentinel.preview("SCL")

gives

InvalidBandName: SCL is not valid.
Valid bands: ('B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B8A', 'B09', 'B11', 'B12')

However if I modify the bands to add it as a valid band, everything seems to work fine:

with S2COGReader("S2A_28QED_20230514_0_L2A") as sentinel:
    sentinel.bands += ("SCL",)
    img = sentinel.preview("SCL")
imshow(img.data_as_image())

image

I wanted to check if there was a particular reason these products weren't available (e.g. a correctness issue and thus I should definitely avoid doing this) or if it was perhaps an oversight?

Add top-level __init__.py

Right now, virtually nothing is exported at the top level:

image

It would make sense at a minimum to export the other submodules; a sketch follows below.
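A minimal sketch of what a top-level __init__.py could re-export (submodule names taken from the readers listed earlier in this README; this is not the project's actual decision):

# rio_tiler_pds/__init__.py (sketch)
from . import cbers, copernicus, landsat, modis, sentinel  # noqa: F401

__all__ = ["cbers", "copernicus", "landsat", "modis", "sentinel"]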

Scene B3DA causing error with no /bounds

Our titiler instance fails when requesting bounds for the following scene:
S1A_S3_GRDH_1SDV_20230807T171941_20230807T172010_049772_05FC2A_B3DA

The object itself looks rather strange in Scihub, with some sort of value-threshold feathering happening beyond the edge of what I would call the bounds of the image.
image

assets -> bands

For this new version I chose to use the word assets instead of bands, to better align with STAC. This is in fact confusing and might also not help with migration.

I propose (with @kylebarron) to move back to bands.

Landsat sceneid_parser fails for albers data

Unsure whether this is something that needs to be fixed, or maybe just documented.

sceneid = 'LC08_L2SP_077010_20210616_20210623_02_A1'
sceneid_parser(sceneid)
# InvalidLandsatSceneId: Could not match LC08_L2SP_077010_20210616_20210623_02_A1

Though the data does exist:

> aws s3 ls s3://usgs-landsat/collection02/level-2/albers/oli-tirs/2021/077/010/LC08_L2SP_077010_20210616_20210623_02_A1/ --request-payer
2021-06-22 19:06:09     117266 LC08_L2SP_077010_20210616_20210623_02_A1_ANG.txt
2021-06-22 19:06:09   72555703 LC08_L2SP_077010_20210616_20210623_02_A1_BT_B10.TIF
2021-06-22 19:06:10   73056444 LC08_L2SP_077010_20210616_20210623_02_A1_BT_B11.TIF
2021-06-22 19:06:40       8458 LC08_L2SP_077010_20210616_20210623_02_A1_BT_stac.json
2021-06-22 19:06:34      24467 LC08_L2SP_077010_20210616_20210623_02_A1_MTL.json
2021-06-22 19:06:11      20153 LC08_L2SP_077010_20210616_20210623_02_A1_MTL.txt
2021-06-22 19:06:11      30060 LC08_L2SP_077010_20210616_20210623_02_A1_MTL.xml
2021-06-22 19:06:11    4555577 LC08_L2SP_077010_20210616_20210623_02_A1_QA_PIXEL.TIF
2021-06-22 19:06:11     285318 LC08_L2SP_077010_20210616_20210623_02_A1_QA_RADSAT.TIF
2021-06-22 19:06:11    1612709 LC08_L2SP_077010_20210616_20210623_02_A1_SAA.TIF
2021-06-22 19:06:11   98628574 LC08_L2SP_077010_20210616_20210623_02_A1_SR_B1.TIF
2021-06-22 19:06:13   98888810 LC08_L2SP_077010_20210616_20210623_02_A1_SR_B2.TIF
2021-06-22 19:06:14   99221882 LC08_L2SP_077010_20210616_20210623_02_A1_SR_B3.TIF
2021-06-22 19:06:15  100230500 LC08_L2SP_077010_20210616_20210623_02_A1_SR_B4.TIF
2021-06-22 19:06:16  101047724 LC08_L2SP_077010_20210616_20210623_02_A1_SR_B5.TIF
2021-06-22 19:06:17   82672180 LC08_L2SP_077010_20210616_20210623_02_A1_SR_B6.TIF
2021-06-22 19:06:17   79008495 LC08_L2SP_077010_20210616_20210623_02_A1_SR_B7.TIF
2021-06-22 19:06:18    6094460 LC08_L2SP_077010_20210616_20210623_02_A1_SR_QA_AEROSOL.TIF
2021-06-22 19:06:40      12491 LC08_L2SP_077010_20210616_20210623_02_A1_SR_stac.json
2021-06-22 19:06:19    2353540 LC08_L2SP_077010_20210616_20210623_02_A1_ST_ATRAN.TIF
2021-06-22 19:06:19   75025324 LC08_L2SP_077010_20210616_20210623_02_A1_ST_B10.TIF
2021-06-22 19:06:19   18508425 LC08_L2SP_077010_20210616_20210623_02_A1_ST_CDIST.TIF
2021-06-22 19:06:20    1467008 LC08_L2SP_077010_20210616_20210623_02_A1_ST_DRAD.TIF
2021-06-22 19:06:20   10647256 LC08_L2SP_077010_20210616_20210623_02_A1_ST_EMIS.TIF
2021-06-22 19:06:21   24674455 LC08_L2SP_077010_20210616_20210623_02_A1_ST_EMSD.TIF
2021-06-22 19:06:21   32356524 LC08_L2SP_077010_20210616_20210623_02_A1_ST_QA.TIF
2021-06-22 19:06:22   60622007 LC08_L2SP_077010_20210616_20210623_02_A1_ST_TRAD.TIF
2021-06-22 19:06:22    2018399 LC08_L2SP_077010_20210616_20210623_02_A1_ST_URAD.TIF
2021-06-22 19:06:40      11764 LC08_L2SP_077010_20210616_20210623_02_A1_ST_stac.json
2021-06-22 19:06:22    1047442 LC08_L2SP_077010_20210616_20210623_02_A1_SZA.TIF
2021-06-22 19:06:23   95533127 LC08_L2SP_077010_20210616_20210623_02_A1_TOA_B1.TIF
2021-06-22 19:06:23   96512869 LC08_L2SP_077010_20210616_20210623_02_A1_TOA_B2.TIF
2021-06-22 19:06:24   97422593 LC08_L2SP_077010_20210616_20210623_02_A1_TOA_B3.TIF
2021-06-22 19:06:25   98720561 LC08_L2SP_077010_20210616_20210623_02_A1_TOA_B4.TIF
2021-06-22 19:06:26  100035700 LC08_L2SP_077010_20210616_20210623_02_A1_TOA_B5.TIF
2021-06-22 19:06:27   80975371 LC08_L2SP_077010_20210616_20210623_02_A1_TOA_B6.TIF
2021-06-22 19:06:28   77750728 LC08_L2SP_077010_20210616_20210623_02_A1_TOA_B7.TIF
2021-06-22 19:06:29   46371301 LC08_L2SP_077010_20210616_20210623_02_A1_TOA_B9.TIF
2021-06-22 19:06:40      12862 LC08_L2SP_077010_20210616_20210623_02_A1_TOA_stac.json
2021-06-22 19:06:29    8090125 LC08_L2SP_077010_20210616_20210623_02_A1_VAA.TIF
2021-06-22 19:06:30    2393656 LC08_L2SP_077010_20210616_20210623_02_A1_VZA.TIF
2021-06-22 19:06:30      92210 LC08_L2SP_077010_20210616_20210623_02_A1_thumb_large.jpeg
2021-06-22 19:06:30      11016 LC08_L2SP_077010_20210616_20210623_02_A1_thumb_small.jpeg

Support use without a context manager

Currently there's a significant amount of initialization done in __enter__ instead of __init__. This means that code will silently fail if not used with a context manager:

scene = S2COGReader('S2B_12SYJ_20200901_0_L2A')
# no error
scene.center
# errors:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-77-5a7975690aa4> in <module>
----> 1 self.center

<ipython-input-43-587f5a19de47> in center(self)
     41         """Dataset center + minzoom."""
     42         return (
---> 43             (self.bounds[0] + self.bounds[2]) / 2,
     44             (self.bounds[1] + self.bounds[3]) / 2,
     45             self.minzoom,

AttributeError: 'S2COGReader' object has no attribute 'bounds'

I believe __init__ should be used for any necessary initialization of the class. __enter__ should only instantiate context-specific attributes like a file descriptor or a database cursor. Such a context really doesn't apply here, and __enter__ is essentially only there for syntactic sugar. Hence initialization should be moved entirely to __init__; a minimal sketch of the pattern follows.
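A minimal sketch of the proposed pattern (using attrs, as the readers in this project already do; the class and method names here are illustrative, not the actual implementation):

import attr


@attr.s
class SceneReader:
    """Illustrative reader: fully initialized in __init__, context manager optional."""

    sceneid: str = attr.ib()
    bounds: tuple = attr.ib(init=False)

    def __attrs_post_init__(self):
        # real initialization happens here (runs as part of __init__), not in __enter__
        self.bounds = self._fetch_bounds()

    def _fetch_bounds(self):
        # placeholder for the network/metadata call a real reader would make
        return (0.0, 0.0, 1.0, 1.0)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        pass


scene = SceneReader("S2B_12SYJ_20200901_0_L2A")
print(scene.bounds)  # available immediately, with or without a "with" block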

Inconsistent Sentinel 2 COG band naming between scenes

I noticed I was getting a lot of 404s on my Sentinel 2 tiler. It seems that something has changed in the dataset, as the returned band names are inconsistent between scenes:

Expected:

from rio_tiler_pds.sentinel.aws import S2COGReader

with S2COGReader("S2A_31NBJ_20190524_0_L2A") as sentinel:
    print(type(sentinel))
    > <class 'rio_tiler_pds.sentinel.aws.sentinel2.S2L2ACOGReader'>
    
    print(sentinel.bands)
    > ('B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B8A', 'B09', 'B11', 'B12')
    
    print([i for i in sentinel.stac_item["assets"]])
    > ['thumbnail', 'overview', 'info', 'metadata', 'visual', 'B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B8A', 'B09', 'B11', 'B12', 'AOT', 'WVP', 'SCL']

When running the scene from the documentation:

with S2COGReader("S2A_29RKH_20200219_0_L2A") as sentinel:
    print(type(sentinel))
    > <class 'rio_tiler_pds.sentinel.aws.sentinel2.S2L2ACOGReader'>
    
    print(sentinel.bands)
    > ()
    
    print([i for i in sentinel.stac_item["assets"]])
    > ['aot', 'blue', 'coastal', 'granule_metadata', 'green', 'nir', 'nir08', 'nir09', 'red', 'rededge1', 'rededge2', 'rededge3', 'scl', 'swir16', 'swir22', 'thumbnail', 'tileinfo_metadata', 'visual', 'wvp', 'aot-jp2', 'blue-jp2', 'coastal-jp2', 'green-jp2', 'nir-jp2', 'nir08-jp2', 'nir09-jp2', 'red-jp2', 'rededge1-jp2', 'rededge2-jp2', 'rededge3-jp2', 'scl-jp2', 'swir16-jp2', 'swir22-jp2', 'visual-jp2', 'wvp-jp2']

I ran several tests with scenes from different areas and date ranges, but almost all scenes seem to return the wrong band names. The S2COGReader class expects band names to be in the "B[0-9A]{2}" format and therefore returns an empty tuple. It seems that name and common_name are the same for the eo:bands metadata in these scenes.

The scene id parser parses both scenes as expected:

from rio_tiler_pds.sentinel import s2_sceneid_parser

print(s2_sceneid_parser("S2A_31NBJ_20190524_0_L2A"))
> {'sensor': '2', 'satellite': 'A', 'utm': '31', 'lat': 'N', 'sq': 'BJ', 'acquisitionYear': '2019', 'acquisitionMonth': '05', 'acquisitionDay': '24', 'num': '0', 'processingLevel': 'L2A', 'scene': 'S2A_31NBJ_20190524_0_L2A', 'date': '2019-05-24', '_utm': '31', '_month': '5', '_day': '24', '_levelLow': 'l2a'}

print(s2_sceneid_parser("S2A_29RKH_20200219_0_L2A"))
> {'sensor': '2', 'satellite': 'A', 'utm': '29', 'lat': 'R', 'sq': 'KH', 'acquisitionYear': '2020', 'acquisitionMonth': '02', 'acquisitionDay': '19', 'num': '0', 'processingLevel': 'L2A', 'scene': 'S2A_29RKH_20200219_0_L2A', 'date': '2020-02-19', '_utm': '29', '_month': '2', '_day': '19', '_levelLow': 'l2a'}

I'm running rio-tiler-pds version 0.7.0

/bounds of objects that cross the Antimeridian

Typically objects are compact single polygons that return reasonable values [minx, miny, maxx, maxy] when you get their bounding box.

Case of interest:

When an object crosses the antimeridian (i.e. ±180º longitude), the typical behavior is splitting it into a multipolygon with disconnected elements on opposite sides of the map.

Current behavior:

The current implementation of the /bounds endpoint does not treat this correctly... It returns a degenerate bbox that looks like [-180, miny, 180, maxy], spanning the whole globe

Preferred behavior:

Instead, according to the GeoJSON spec (https://datatracker.ietf.org/doc/html/rfc7946#section-5.2), the bbox should have the "wrong" order for the longitude coordinates (i.e. minx > maxx).
Note that it is not simply enough to return [180, miny, -180, maxy]. The endpoint should actually find the leftmost bound of the rightmost polygon as minx, and the rightmost point of the leftmost polygon as maxx.

A frontend application may then choose to handle this case as best they see fit, for instance see: https://github.com/developmentseed/rio-viz/blob/main/rio_viz/templates/index.html#L1202-L1210

Open question:

I'm not sure how this behavior should be defined in more complex situations, where the object is not convex. For instance, a U-shaped polygon that crosses the antimeridian could result in multipolygon with 3 components.

Example:

/bounds?sceneid=S1A_IW_GRDH_1SDV_20230726T183302_20230726T183327_049598_05F6CA_31E7
returns
[-180.0, 61.06949078480844, 180.0, 62.88226850489882]
because its productInfo.json looks similar to this:

"footprint" : {
    "type" : "MultiPolygon",
    "coordinates" : [ 
        [ [ [ 180.0, 62.52138872791733 ], [ 179.94979152463551, 62.52658106629492 ], [ 179.46990657679865, 62.57485506167223 ], ..., [ 179.97829453382462, 61.06949078480844 ], [ 180.0, 61.115074335603865 ], [ 180.0, 62.52138872791733 ] ] ], 
        [ [ [ -180.0, 62.52138872791733 ], [ -180.0, 61.115074335603865 ], [ -179.95528668344312, 61.208976568342585 ], ..., [ -179.34123633603335, 62.451941133348306 ], [ -179.81084543337622, 62.50182719954094 ], [ -180.0, 62.52138872791733 ] ] ] ]
  },

[Resolved] deprecate AWS Landsat8 Collection 1

Ref https://lists.osgeo.org/mailman/listinfo/landsat-pds

All, in case you have not seen it, you should look over the recent announcement from USGS related to Collection 2 and its availability on AWS.

https://www.usgs.gov/news/usgs-releases-most-advanced-landsat-archive-date
https://www.usgs.gov/core-science-systems/nli/landsat/landsat-commercial-cloud-data-access

First of all, I just want to say how awesome this is! Planet has been managing the dataset on AWS from the beginning and through their work and all of your demonstrated use cases, USGS saw the value of making the data available in a new format via a fundamentally new distribution mechanism (though their existing distribution mechanisms still remain if you’re looking for that).

Since there is now an S3 bucket with Collection 2 data, the full archive and owned directly by USGS, I am proposing that we deprecate the existing landsat-pds bucket. I know that a number of you have come to depend on the landsat-pds bucket and I want to be sensitive to that, so I am proposing a 6 month deprecation period which would put us at an end date (bucket deleted) of July 1, 2021.

The USGS is making the data available in COG format with STAC metadata. There are some prefix changes from landsat-pds and of course the science updates for Collection 2 and the data is being made available in a Requester Pays S3 bucket (https://docs.aws.amazon.com/AmazonS3/latest/dev/RequesterPaysBuckets.html). There is currently no SNS topic available for new data, but I believe a few of us have reached out to USGS about that.

I want to note that this data is owned and managed by USGS, AWS has no control over the data or the bucket. However, I am happy to help anyone work through the implications of switching their workloads to the USGS bucket rather than the landsat-pds bucket.

Also, I want to give a huge thank you to Amit who has managed this resource for the community for the past several years!

I'll be out for the next two weeks so you might see an OOO response if you reach out to me, but I will look to answer when I am back in the office.

Thanks everyone and hope you are staying safe.

Joe Flasher
