
stackstac's Introduction

StackSTAC


Turn a list of STAC items into a 4D xarray DataArray (dims: time, band, y, x), including reprojection to a common grid. The array is a lazy Dask array, so loading and processing the data in parallel—locally or on a cluster—is just a compute() call away.

For more information and examples, please see the documentation.

import stackstac
import pystac_client

URL = "https://earth-search.aws.element84.com/v1"
catalog = pystac_client.Client.open(URL)

stac_items = catalog.search(
    intersects=dict(type="Point", coordinates=[-105.78, 35.79]),
    collections=["sentinel-2-l2a"],
    datetime="2020-04-01/2020-05-01"
).get_all_items()

stack = stackstac.stack(stac_items)
print(stack)
<xarray.DataArray 'stackstac-fefccf3d6b2f9922dc658c114e79865b' (time: 13, band: 17, y: 10980, x: 10980)>
dask.array<fetch_raster_window, shape=(13, 17, 10980, 10980), dtype=float64, chunksize=(1, 1, 1024, 1024), chunktype=numpy.ndarray>
Coordinates: (12/24)
  * time                        (time) datetime64[ns] 2020-04-01T18:04:04 ......
    id                          (time) <U24 'S2B_13SDV_20200401_0_L2A' ... 'S...
  * band                        (band) <U8 'overview' 'visual' ... 'WVP' 'SCL'
  * x                           (x) float64 4e+05 4e+05 ... 5.097e+05 5.098e+05
  * y                           (y) float64 4e+06 4e+06 ... 3.89e+06 3.89e+06
    view:off_nadir              int64 0
    ...                          ...
    instruments                 <U3 'msi'
    created                     (time) <U24 '2020-09-05T06:23:47.836Z' ... '2...
    sentinel:sequence           <U1 '0'
    sentinel:grid_square        <U2 'DV'
    title                       (band) object None ... 'Scene Classification ...
    epsg                        int64 32613
Attributes:
    spec:        RasterSpec(epsg=32613, bounds=(399960.0, 3890220.0, 509760.0...
    crs:         epsg:32613
    transform:   | 10.00, 0.00, 399960.00|\n| 0.00,-10.00, 4000020.00|\n| 0.0...
    resolution:  10.0

Once in xarray form, many operations become easy. For example, we can compute a low-cloud weekly mean-NDVI timeseries:

lowcloud = stack[stack["eo:cloud_cover"] < 40]
nir, red = lowcloud.sel(band="B08"), lowcloud.sel(band="B04")
ndvi = (nir - red) / (nir + red)
weekly_ndvi = ndvi.resample(time="1w").mean(dim=("time", "x", "y")).rename("NDVI")
# Call `weekly_ndvi.compute()` to process ~25GiB of raster data in parallel. Might want a dask cluster for that!

Installation

pip install stackstac

Windows notes:

On Windows, it's a good idea to use conda to install rasterio; GDAL-style installations are considerably more painful with pip. Then pip install stackstac.

Things stackstac does for you:

  • Figure out the geospatial parameters from the STAC metadata (if possible): a coordinate reference system, resolution, and bounding box.
  • Transfer the STAC metadata into xarray coordinates for easy indexing, filtering, and provenance of metadata.
  • Efficiently generate a Dask graph for loading the data in parallel.
  • Mediate between Dask's parallelism and GDAL's aversion to it, allowing for fast, multi-threaded reads when possible, and at least preventing segfaults when not.
  • Mask nodata and rescale by STAC item asset scales/offsets.
  • Display data in interactive maps in a notebook, computed on the fly by Dask.

Limitations:

  • Raster data only! We are currently ignoring other types of assets you might find in a STAC (XML/JSON metadata, point clouds, video, etc.).
  • Single-band raster data only! Each band has to be a separate STAC asset—a separate red, green, and blue asset on each Item is great, but a single RGB asset containing a 3-band GeoTIFF is not supported yet.
  • COGs work best. "Normal" GeoTIFFs that aren't internally tiled, or don't have overviews, will see much worse performance. Sidecar files (like .msk files) are ignored for performance. JPEG2000 will probably work, but will probably be slow unless you buy Kakadu. Formats make a big difference.
  • BYOBlocksize. STAC doesn't offer any metadata about the internal tiling scheme of the data. Knowing it can make IO more efficient, but actually reading the data to figure it out is slow. So it's on you to set this parameter. (But if you don't, things should be fine for any reasonable COG.)
  • Doesn't make geospatial data any easier to work with in xarray. Common operations (picking bands, clipping to bounds, etc.) are tedious to type out. Real geospatial operations (shapestats on a GeoDataFrame, reprojection, etc.) aren't supported at all. rioxarray might help with some of these, but it has limited support for Dask, so be careful you don't kick off a huge computation accidentally.
  • I haven't even written tests yet! Don't use this in production. Or do, I guess. Up to you.

Roadmap:

Short-term:

  • Write tests and add CI (including typechecking)
  • Support multi-band assets
  • Easier access to s3://-style URIs (right now, you'll need to pass in gdal_env=stackstac.DEFAULT_GDAL_ENV.updated(always=dict(session=rio.session.AWSSession(...))))
  • Utility to guess blocksize (open a few assets)
  • Support item assets to provide more useful metadata with collections that use it (like S2 on AWS)
  • Rewrite dask graph generation once the Blockwise IO API settles

Long term (if anyone uses this thing):

  • Support other readers (aiocogeo?) that may perform better than GDAL for specific formats
  • Interactive mapping with xarray_leaflet, made performant with some Dask graph-rewriting tricks to do the initial IO at coarser resolution for lower zoom levels (otherwise zooming out could process terabytes of data)
  • Improve ergonomics of xarray for raster data (in collaboration with rioxarray)
  • Implement core geospatial routines (warp, vectorize, vector stats, GeoPandas/spatialpandas interop) in Dask

stackstac's People

Contributors

aazuspan, carderne, g2giovanni, gjoseph92, jamesoconnor, jorge-cervest, kylebarron, ljstrnadiii, richardscottoz, robintw, rschueder, scottyhq, sharkinsspatial, tomaugspurger, yellowcap


stackstac's Issues

ValueError: conflicting sizes for dimension with single item / asset

I haven't looked too closely at this, but I'm wondering if there's some issue with the resampling / resolution handling? In this example I have a single STAC item and I select a single asset from it. stackstac (actually xarray) complains that the sizes of the data and coordinates don't match. (this example requires pip install planetary-computer and pystac, but doesn't need an API token).

import planetary_computer as pc
import stackstac
import requests
import pystac

r = requests.get(
    "https://planetarycomputer.microsoft.com/api/stac/v1/collections/landsat-8-c2-l2/items/LC08_L2SP_046027_20200908_20200919_02_T1"
)

item = pystac.Item.from_dict(r.json())

sitem = pc.sign_assets(item)

stackstac.stack([sitem.to_dict()], assets=["SR_B2"])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-b99acad73e1d> in <module>
     13 sitem = pc.sign_assets(item)
     14 
---> 15 stackstac.stack([sitem.to_dict()], assets=["SR_B2"])

/srv/conda/envs/notebook/lib/python3.8/site-packages/stackstac/stack.py in stack(items, assets, epsg, resolution, bounds, bounds_latlon, snap_bounds, resampling, chunksize, dtype, fill_value, rescale, sortby_date, xy_coords, properties, band_coords, gdal_env, reader)
    290     )
    291 
--> 292     return xr.DataArray(
    293         arr,
    294         *to_coords(

/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/dataarray.py in __init__(self, data, coords, dims, name, attrs, indexes, fastpath)
    407             data = _check_data_shape(data, coords, dims)
    408             data = as_compatible_data(data)
--> 409             coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
    410             variable = Variable(dims, data, attrs, fastpath=True)
    411             indexes = dict(

/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/dataarray.py in _infer_coords_and_dims(shape, coords, dims)
    155         for d, s in zip(v.dims, v.shape):
    156             if s != sizes[d]:
--> 157                 raise ValueError(
    158                     "conflicting sizes for dimension %r: "
    159                     "length %s on the data but length %s on "

ValueError: conflicting sizes for dimension 'x': length 7772 on the data but length 7773 on coordinate 'x'

The correct shape, according to rasterio, is

>>> ds = rasterio.open(sitem.assets["SR_B2"].href)
>>> ds.shape
(7891, 7771)

This could easily be user error :)

Support non-NaN nodata values in `mosaic`

From microsoft/PlanetaryComputerExamples#108 (comment).

Support a `nodata=` kwarg (defaulting to `np.nan`) on `stackstac.mosaic`, so you can specify a different nodata value if that's what your data uses.

The larger problem here is that we're not storing the nodata value in xarray metadata anywhere—just using the xarray convention that NaN is the standard nodata value. But if you set fill_value, that's no longer the case.

Broadly related to #50: rioxarray has a convention for storing nodata in attributes, which we could follow, it's just convoluted CF-style. A single nodata coordinate might be more sensible to xarray users. (The real answer is native missing-value support in xarray though pydata/xarray#4143 pydata/xarray#3955 pydata/xarray#1194.)

If we had this, mosaic could automatically use the nodata value stored in the data, without requiring you to keep track of it.
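A minimal sketch of the "single nodata coordinate" idea using plain xarray (a hypothetical convention, not something stackstac implements today):

```python
import numpy as np
import xarray as xr

# Hypothetical convention: attach the fill value as a scalar coordinate, so it
# travels with the array through selections and arithmetic.
da = xr.DataArray(np.zeros((2, 2)), dims=("y", "x")).assign_coords(nodata=0)

# Downstream code (like a mosaic) could then look it up instead of assuming NaN:
fill = da.coords["nodata"].item()
```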

cc @TomAugspurger

High-level explanation of how stackstac aligns input data

Users sometimes ask, "how do the geotiffs get aligned into one array"? If you read the stackstac.stack docstring many times, you could probably get a sense of this, but there should be a high-level explanation of the process on the docs.

Specifically, it's good to clarify that stackstac isn't magically aligning/coregistering the GeoTIFFs to each other in some way. It's just picking a reasonable common grid for the items (resolution, CRS, bounds), and then warping each input item to match that grid.

xref #108 (reply in thread)

Investigate large graph warnings

I've noticed large graph warnings from Dask when working with reasonably-sized stackstac DataArrays, like

UserWarning: Large object of size 1.73 MiB detected in task graph:

One thought: what if this is a situation like dask/dask#8008? When we turn the asset table into a Dask array, we're making one chunk per element. What if each of these embedded elements aren't actually size one, but reference the entire memory of the asset table? That would make the serialized size of the asset table N^2!

Naw, I think that's unlikely. Serialization isn't dumb enough to copy the entire buffer even when it's not needed. There's probably some other cruft in there.
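That hunch is easy to sanity-check with the stdlib (using a plain float array as a stand-in for the asset table): pickling a small view does not drag the parent array's buffer along.

```python
import pickle

import numpy as np

# If serializing a one-element view copied the whole parent buffer, the two
# sizes below would be comparable. They aren't: NumPy copies only the view's
# own data when pickling.
big = np.zeros((1024, 1024))  # 8 MiB of float64
view = big[:1, :1]            # one element, still backed by `big`

print(len(pickle.dumps(big)) > 8_000_000)  # True
print(len(pickle.dumps(view)) < 1_000)     # True
```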

Switch notebooks from Coiled to Binder

Coiled is deprecating Notebooks and Jobs (cc @marcosmoyano). The docs currently have links to launch Coiled notebooks. If we switch clusters to use websockets, things should hopefully work from binder. Binder should already work thanks to #43, so this is a matter of:

  • switch notebooks to use protocol="wss" on Coiled clusters (add comment about this)
  • maybe add a section to notebooks for users to authenticate with coiled
  • make sure things work on binder
  • update links in docs (somewhere in here)

    stackstac/docs/conf.py

    Lines 90 to 111 in e5c7e0f

    {% set docname = env.doc2path(env.docname, base=False) %}
    .. note::
    You can view & download the original notebook
    `on Github <https://github.com/gjoseph92/stackstac/blob/main/docs/{{
    "../" + docname if docname.startswith("examples") else docname
    }}>`_.
    Or, `click here <https://cloud.coiled.io/gjoseph92/jobs/stackstac>`_
    to run these notebooks on Coiled with access to Dask clusters.
    """
    # TODO enable binder once Coiled supports websocket clusters over 443.
    # (Binder blocks outbound traffic on all ports besides 22, 80, and 443, so we can't connect to Coiled on 8786.)
    # nbsphinx_prolog = """
    # {% set docname = env.doc2path(env.docname, base=False) %}
    # .. note::
    # You can run this notebook interactively here: |Binder|, or view & download the original
    # `on Github <https://github.com/gjoseph92/stackstac/blob/main/docs/{{ docname }}>`_.
    # .. |Binder| image:: https://mybinder.org/badge_logo.svg
    # :target: https://mybinder.org/v2/gh/gjoseph92/stackstac/main?urlpath=lab/tree/docs/{{ docname }}
    # """
  • add binder badge to readme

Use rasterio's new Python-file support?

With rasterio/rasterio#2141, rasterio will be able to open Python file-like objects (including fsspec). Can/should we take advantage of this in stackstac in any way?

The main thing that comes to mind is making auth easier. Instead of having to set AWS_SECRET_ACCESS_KEY, etc. in the LayeredEnv (example here rmg55/CloudDAAC_Binders#1 (comment)) or manage a .netrc file (#30), perhaps we could let fsspec or something else handle it?

However, our rasterio performance with dask is highly dependent on GDAL using libcurl directly, because we can flip curl's cache on and off when reopening datasets in multiple threads:

open=dict(
    VSI_CACHE=True
    # ^ cache HTTP requests for opening datasets. This is critical for `ThreadLocalRioDataset`,
    # which re-opens the same URL many times---having the request cached makes subsequent `open`s
    # in different threads snappy.
),
read=dict(
    VSI_CACHE=False
    # ^ *don't* cache HTTP requests for actual data. We don't expect to re-request data,
    # so this would just blow out the HTTP cache that we rely on to make repeated `open`s fast
    # (see above)
),

If we just blindly used an fsspec object, I imagine the reads for opening a dataset wouldn't be cached?

Maybe the better approach is to use our own file-like object for caching the open reads, and stop relying on VSI_CACHE. Then this could wrap any valid input for a rio.open—including another file-like object.

But there's also the question of UI. Users wouldn't be passing in their own file-like objects anyway; STAC asset URLs are always going to just be strings. We could just use fsspec.open internally for everything, and allow users to pass in fsspec **kwargs? This is straightforward, but also might be hard to figure out for users who don't already know fsspec.

Support dask>=2022

Our dask dependency won't allow us to go beyond 2021:

dask = {extras = ["array"], version = "^2021.4.1"}

Bump this and cut a release. When #106 is unblocked we'll only be able to support the latest version anyway.

Mosaicking

From pangeo-data/cog-best-practices#4 (comment):

Looks like this is focused on "stacking" items/assets/bands (which I think is the most common workflow). Any plans to incorporate the mosaic workflow like @TomAugspurger has put together in stac_vrt (or maybe this is already there and I just missed it)?

I didn't add any mosaicking directly (at the GDAL level), since you can actually do it pretty easily with plain dask/numpy. Something like:

# TODO `fill_value`s besides NaN
import numpy as np

def _mosaic(chunk, axis):
    """Flatten one axis by taking the first non-NaN value along it."""
    ax_length = chunk.shape[axis]
    if ax_length <= 1:
        return chunk
    out = np.take(chunk, 0, axis=axis)
    for i in range(1, ax_length):
        layer = np.take(chunk, i, axis=axis)
        # Fill any remaining NaNs in `out` from the next layer down.
        out = np.where(np.isnan(out), layer, out)
    return out

mosaicked = stack.reduce(_mosaic, dim="time")

As far as I know, there aren't really any advantages to doing the mosaic in GDAL versus in dask. The one advantage GDAL could theoretically have is that it could short-circuit, and stop loading additional datasets as soon as the output image is already fully-filled-in—however, I don't know if GDAL actually implements this logic. And even if it does, the performance gains of early termination would quickly lose out to the cost of loading each dataset serially. Basically, I think you're better off letting dask read everything in parallel, then throwing away some data, compared to worst-case having GDAL read hundreds of datasets in serial.

So short answer: yes, this is focused only on "stacking", because I think of "mosaic" as just one among many reduction operations you might want to do to a stack (mean, median, quality-band mosaic, etc.).

The bigger question is whether offering a mosaic function is in scope for this project. Personally, I'd like it to be, but it should probably live on an xarray accessor, which starts to bump up against rioxarray's territory.

Passing resampling

Without thinking I put resampling="bilinear" and got an error when I called .compute()

Traceback (most recent call last):
  File "carajas.py", line 92, in <module>
    band_medianNP = band_median.compute()
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/xarray/core/dataarray.py", line 899, in compute
    return new.load(**kwargs)
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/xarray/core/dataarray.py", line 873, in load
    ds = self._to_temp_dataset().load(**kwargs)
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/xarray/core/dataset.py", line 798, in load
    evaluated_data = da.compute(*lazy_data.values(), **kwargs)
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/dask/base.py", line 565, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/dask/threaded.py", line 76, in get
    results = get_async(
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/dask/local.py", line 487, in get_async
    raise_exception(exc, tb)
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/dask/local.py", line 317, in reraise
    raise exc
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/dask/local.py", line 222, in execute_task
    result = _execute_task(task, data)
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/dask/core.py", line 121, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/dask/optimization.py", line 963, in __call__
    return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/dask/core.py", line 151, in get
    result = _execute_task(task, cache)
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/stackstac/to_dask.py", line 151, in fetch_raster_window
    data = reader.read(current_window)
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/stackstac/rio_reader.py", line 393, in read
    reader = self.dataset
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/stackstac/rio_reader.py", line 389, in dataset
    self._dataset = self._open()
  File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/stackstac/rio_reader.py", line 355, in _open
    vrt = WarpedVRT(
  File "rasterio/_warp.pyx", line 871, in rasterio._warp.WarpedVRTReaderBase.__init__
TypeError: an integer is required

so I presume you need to pass the integer from the enum? e.g. 1 in this case? Or I am not quite clear on what this keyword is expecting?

Clarification on dependencies minimum versions

I was recently trying to install stackstac alongside the latest tensorflow (conda create -n stackstack-ml tensorflow=2.6 stackstac), but this fails due to incompatible numpy dependencies. In particular

numpy = "^1.20.0"

Translates to numpy >=1.20.0,<2.0.0 in the conda-forge feedstock

And conda search tensorflow-base=2.6 --info is quite strict with numpy >=1.19.2,<1.20

I realize this isn't really a bug with stackstac, but is 1.20 truly the minimum required version? Or just the last version being used during development?

cc @ocefpaf who is the listed maintainer of the conda-forge feedstock.

Revamp Dask Array creation logic to be fully blockwise

When dask/dask#7417 gets in, it may both break the current dask-array-creation hacks, and open the door for a much, much simpler approach: just use da.from_array.

We didn't use da.from_array originally and went through all the current rigamarole because from_array generates a low-level graph, which can be enormous (and slow) for the large datasets we load. But once from_array uses Blockwise, it will be far simpler and more efficient.

We'll just switch to having an array-like class that wraps the asset table and other parameters, and whose __getitem__ basically calls fetch_raster_window. However, it's likely worth thinking about just combining all this into the Reader protocol in some way.

This will also make it easier to do #62 (and may even inadvertently solve it).

Tests

We need them. Very much so.

Jotting down some thoughts on what I'd like testing to look like for this project:

  • Lean on typechecking; only write tests for logic.
  • Use hypothesis as much as possible.
    • I'd like to work off of https://github.com/inspera/jsonschema-typed to make a utility for basically converting JSON schema to a .pyi file of TypedDicts for that schema. From that (and a bit more logic), we can basically have Hypothesis generate any valid STAC JSON for us (and be sure our typechecking is correct too).
  • Prefer real-world / e2e-ish tests over unit tests. If there's a clear unit to test, test it, but otherwise don't worry about passing through a lot of logic in a test.
  • Add some data fixtures, of both STAC metadata and actual assets that have tripped us up.
  • Run mindeps tests like dask. Unlike dask, it would be great to automate figuring out those versions. Poetry doesn't support this yet, but it's probably possible: python-poetry/poetry#3527.
  • Continue using poetry, not conda. Conda is great for data science when you need system dependencies, but the ability to get deterministic environments from poetry makes it better for developing and testing packages.
  • shed for formatting (probably).

`AttributeError` for valid Stac Items that don't have a 'type' field

Current logic assumes STAC assets always have a mimetype 'type' field, but it's optional according to the Item Spec

import pystac  #1.1.0
import stackstac #0.2.1
item = pystac.read_file('https://cmr.earthdata.nasa.gov/cloudstac/LPCLOUD/collections/HLSL30.v2.0/items/HLS.L30.T10TEM.2021199T185036.v2.0')
print(item.validate())
stackstac.stack(item)
/srv/conda/envs/notebook/lib/python3.8/site-packages/stackstac/stack.py in stack(items, assets, epsg, resolution, bounds, bounds_latlon, snap_bounds, resampling, chunksize, dtype, fill_value, rescale, sortby_date, xy_coords, properties, band_coords, gdal_env, errors_as_nodata, reader)
    287         )  # type: ignore
    288 
--> 289     asset_table, spec, asset_ids, plain_items = prepare_items(
    290         plain_items,
    291         assets=assets,

/srv/conda/envs/notebook/lib/python3.8/site-packages/stackstac/prepare.py in prepare_items(items, assets, epsg, resolution, bounds, bounds_latlon, snap_bounds)
     87                 type_strs_by_id[asset_id].add(asset.get("type"))
     88 
---> 89         mimetypes_by_id = {
     90             id: [Mimetype.from_str(t) for t in types]
     91             for id, types in type_strs_by_id.items()

/srv/conda/envs/notebook/lib/python3.8/site-packages/stackstac/prepare.py in <dictcomp>(.0)
     88 
     89         mimetypes_by_id = {
---> 90             id: [Mimetype.from_str(t) for t in types]
     91             for id, types in type_strs_by_id.items()
     92         }

/srv/conda/envs/notebook/lib/python3.8/site-packages/stackstac/prepare.py in <listcomp>(.0)
     88 
     89         mimetypes_by_id = {
---> 90             id: [Mimetype.from_str(t) for t in types]
     91             for id, types in type_strs_by_id.items()
     92         }

/srv/conda/envs/notebook/lib/python3.8/site-packages/stackstac/prepare.py in from_str(cls, mimetype)
     37     @classmethod
     38     def from_str(cls, mimetype: str) -> Mimetype:
---> 39         parts = [p.strip() for p in mimetype.split(";")]
     40         type, *subtype = parts[0].split("/")
     41         if len(subtype) == 0:

AttributeError: 'NoneType' object has no attribute 'split'

pystac ItemCollections unrecognized without installing `stackstac[viz]`

I am trying an example posted here:

lon, lat = -105.78, 35.79
items = catalog.search(
    intersects=dict(type="Point", coordinates=[lon, lat]),
    collections=["sentinel-s2-l2a-cogs"],
    datetime="2020-04-01/2020-05-01"
).get_all_items()
stack = stackstac.stack(items)

and am seeing

TypeError: Unrecognized STAC collection type <class 'pystac.item_collection.ItemCollection'>: <pystac.item_collection.ItemCollection object at 0x7f66a7efabe0>

Any ideas what is up?

Cannot pick a common CRS, since assets have multiple CRSs: asset 'overview'

lib/python3.9/site-packages/stackstac/prepare.py in prepare_items(items, assets, epsg, resolution, bounds, bounds_latlon, snap_bounds)
142 out_epsg = asset_epsg
143 elif out_epsg != asset_epsg:
--> 144 raise ValueError(
145 f"Cannot pick a common CRS, since assets have multiple CRSs: asset {id!r} of item "
146 f"{item_i} {item['id']!r} is in EPSG:{asset_epsg}, "

ValueError: Cannot pick a common CRS, since assets have multiple CRSs: asset 'overview' of item 1 'S2B_11UQS_20210501_0_L2A' is in EPSG:32611, but assets before it were in EPSG:32612.

Use smaller internal data format when possible

From looking at some examples, it appears that data is always loaded to float64 arrays. For example in https://github.com/gjoseph92/stackstac/blob/5f984b211993380955b5d3f9eba3f3e285f6952c/examples/show.ipynb, loading the RGB bands of a Sentinel 2 asset (rgb = stack.sel(band=["B04", "B03", "B02"]).persist() ) creates an xarray dataset of type float64. It seems to me that you could improve performance (or at least memory usage) if you were able to use a smaller data type when possible.

You could look at the raster:bands object if it exists to optimize the xarray data type. If the extension doesn't exist, or if the bands have mixed dtypes, then fall back to float64?
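As a hedged sketch: `stack()` already takes `dtype` and `fill_value` kwargs, so users who know their data can opt into a narrower type today, and the memory argument is easy to quantify for one 10980×10980 Sentinel-2 band:

```python
import numpy as np

# With a narrower dtype (and a fill_value representable in it), you can ask
# stackstac for smaller arrays up front, e.g.:
#
#   stack = stackstac.stack(stac_items, dtype="uint16", fill_value=0)
#
# (`stac_items` as in the README example.) The per-band memory difference:
def band_mib(dtype):
    return np.dtype(dtype).itemsize * 10980 * 10980 / 2**20

print(round(band_mib("float64")), round(band_mib("uint16")))  # 920 230
```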

Index bands by common_name when available?

If the common_name coordinate exists, it might be nice to have the xarray's band dimension indexed by it by default (so you can do sel(band="red")). Or, it might be annoying to have the inconsistency?
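One way this already works from the user side, sketched with a tiny stand-in array (stackstac puts `common_name` alongside the `band` index when the eo extension provides it):

```python
import numpy as np
import xarray as xr

# Tiny stand-in for a stacked array: `band` indexed by asset ID, with a
# `common_name` coordinate alongside it.
da = xr.DataArray(
    np.zeros((2, 3, 3)),
    dims=("band", "y", "x"),
    coords={"band": ["B04", "B08"], "common_name": ("band", ["red", "nir"])},
)

# Swap the dimension so `common_name` labels the bands:
by_name = da.swap_dims({"band": "common_name"})
red = by_name.sel(common_name="red")  # shape (3, 3)
```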

Protocol from typing

Hi guys,

I am trying to run the basics notebook on Colab, but it throws this error:
ImportError: cannot import name 'Protocol' from 'typing' (/usr/lib/python3.7/typing.py)
Any advice to fix this?
Thanks in advance, JL
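For reference, `typing.Protocol` only exists on Python ≥ 3.8 (Colab here is on 3.7); the usual workaround is the `typing_extensions` backport:

```python
# typing.Protocol was added in Python 3.8; on 3.7, install and import the
# `typing_extensions` backport instead.
try:
    from typing import Protocol
except ImportError:  # Python < 3.8
    from typing_extensions import Protocol
```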

`xy_coords` introduces unexpected (?) pixel shift

Hi all!
Either the shift in pixels that I have observed is actually unexpected and needs to somehow be adjusted in the to_coords function or I'm simply missing something or misunderstanding the concept behind geotransforms and it's only unexpected for me at the moment. Hope someone can help me out either way 🙂

In the following screenshots, the colored image is always the original raster, while the greyscale image is always the one created with these lines of code:

scene_stack = stackstac.stack(items=stac_obj, xy_coords=xy_coords, dtype='float32', rescale=False)
scene_stack.rio.to_raster('./scene_stack.tif')

Transform and projection of the raster: 'proj:transform': [10, 0, 399960, 0, -10, 52000020] & 'proj:epsg': 32632 (UTM 32N)

  1. xy_coords=False
    xycoords_false

  2. xy_coords='center' | pixel shift: 0, 1 (expected: 0.5, -0.5)
    xycoords_center

  3. xy_coords='topleft' | pixel shift: -0.5, 0.5 (expected: 0, 0)
    xycoords_topleft

As expected, no shift is observed in (1). Now with (2) and (3), I'm confused. My original raster is a COG following the usual raster conventions, e.g. being top left aligned. Therefore I'd assume that using the default as in (3) results in no pixel shift at all. I'd also assume that the pixels are shifted by 0.5 and -0.5 in x and y respectively when using 'center' as in (2).

Looking forward to hearing some other thoughts on this!

Cheers, Marco
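For anyone following along, the topleft/center distinction is just a half-pixel offset applied through the geotransform. A pure-arithmetic sketch (ignoring rotation terms) using the transform from the README example above, a 10 m UTM grid with origin (399960, 4000020):

```python
# proj:transform is [a, b, c, d, e, f]: x = c + a*col + b*row, y = f + d*col + e*row.
a, b, c, d, e, f = 10, 0, 399960, 0, -10, 4000020

def pixel_xy(col, row, offset):
    # offset=0.0 -> top-left corner of the pixel; offset=0.5 -> its center
    return (c + a * (col + offset), f + e * (row + offset))

print(pixel_xy(0, 0, 0.0))  # (399960.0, 4000020.0): 'topleft' coordinate of pixel (0, 0)
print(pixel_xy(0, 0, 0.5))  # (399965.0, 4000015.0): 'center' coordinate
```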

Support multi-band COGs

This is mentioned in the README, but I thought I'd open an issue that I can link to.

Currently, multi-band COGs are not supported by stackstac.

>>> import pystac
>>> import stackstac

>>> item = pystac.read_file(
...     "https://planetarycomputer.microsoft.com/api/stac/v1/collections/naip/items/fl_m_2608004_nw_17_060_20191215_20200113"
... )
>>> ds = stackstac.stack([item.to_dict()])
>>> ds.shape
(1, 1, 12240, 11040)

That COG has 4 bands, so the shape should be (1, 4, 12240, 11040).

I started a branch, but haven't had a chance to finish it off. I'll post here if / when I pick it up again.

Manage .netrc files for workers not on the same filesystem

Re #29: sounds like we currently need to have a .netrc file for libcurl to access NASA URS data: https://gist.github.com/scottyhq/c4a4e889b58a0a153dd5fb18bad9f3e8#gistcomment-3713665.

I would like to make the auth/env experience much smoother and automatically derive from the local environment in the future, but as a first step, let's just provide a path to make this work.

Ideas:

  • Create a Netrc class that copies your local .netrc into memory, and when used as a contextmanager, writes the contents to a tempfile, and sets the path as the NETRC env var. Creating and deleting the tempfile per asset is probably a bad idea, though
  • Create a worker plugin for copying the local .netrc to a tempfile on workers, and setting NETRC. Probably more reasonable for performance.
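A rough sketch of the second idea. A dask worker plugin is just an object with a `setup` method, so a hypothetical `NetrcPlugin` (not part of stackstac) could look like:

```python
import os
import tempfile

class NetrcPlugin:
    """Hypothetical worker plugin: ship the client's .netrc to each worker.

    Register with `client.register_worker_plugin(NetrcPlugin())` on a
    dask.distributed Client; dask only requires a `setup` method.
    """

    def __init__(self, netrc_path=os.path.expanduser("~/.netrc")):
        # Read the credentials client-side, where the file exists.
        with open(netrc_path) as f:
            self.contents = f.read()

    def setup(self, worker=None):
        # On each worker: write the contents once to a tempfile and point
        # libcurl at it via the NETRC env var.
        fd, path = tempfile.mkstemp(prefix="netrc-")
        with os.fdopen(fd, "w") as f:
            f.write(self.contents)
        os.environ["NETRC"] = path
```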

Support `proj:wkt2` if `proj:epsg` is null

As an example, see this Landsat Albers scene from the Landsat Collection 2 STAC API:

https://ibhoyw8md9.execute-api.us-west-2.amazonaws.com/prod/collections/landsat-c2l2alb-sr/items/LC08_L2SR_082024_20210806_20210811_02_A1_SR

I hacked stackstac to accept any CRS string in the epsg keyword rather than requiring an EPSG code, which basically just means handing it along here:

attrs = {"spec": spec, "crs": f"epsg:{spec.epsg}", "transform": spec.transform}

Would be good to follow the same logic here that the new GDAL STAC Item driver does:
OSGeo/gdal#4138

which allows for:
image

Rather than dropping the epsg keyword, it may be better to add a deprecation warning and a new crs keyword. When parsing the STAC, it can then construct the right CRS based on what is available for the Item/Asset.

@gjoseph92 I'm happy to issue a PR for this, lmk if I should, don't want to duplicate work if you had already been working on this.
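A minimal, dependency-free sketch of that fallback logic might look like the following; `resolve_crs` is a hypothetical helper, and the precedence shown (epsg, then wkt2, then projjson) is one plausible ordering, not necessarily exactly what the GDAL driver does:

```python
import json

def resolve_crs(props: dict) -> str:
    """Pick a CRS representation from STAC projection-extension fields.

    Returns a string that CRS constructors like pyproj.CRS or
    rasterio.crs.CRS can generally parse. (Sketch only; real code
    would construct a CRS object directly and handle parse errors.)
    """
    epsg = props.get("proj:epsg")
    if epsg is not None:
        return f"epsg:{epsg}"
    wkt2 = props.get("proj:wkt2")
    if wkt2 is not None:
        return wkt2
    projjson = props.get("proj:projjson")
    if projjson is not None:
        return json.dumps(projjson)
    raise ValueError("No projection information found on the Item/Asset")
```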

stackstac.stack to support one time coordinate per unique datetime?

Please see below an example running on the Planetary Computer using Esri 10m Land Cover data, where each STAC item is derived from a mosaic of many images. The output is a 4D cube whose time dimension has length 4, but the time coordinates are just 2020-06-01T00:00:00 repeated.

The documentation clearly states that ``time`` will be equal in length to the number of items you pass in, and indexed by STAC Item datetime. But it would feel more natural for the DataArray to have one time coordinate per unique datetime in the STAC items. Would stackstac.stack support this feature?

import numpy as np
import planetary_computer as pc
import pystac_client
import stackstac

catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1/"
)
point = {"type": "Point", "coordinates": [-97.807733, 33.2133019]}

io_lulc_search = catalog.search(collections=["io-lulc"], intersects=point)
io_lulc_items = [pc.sign(item).to_dict() for item in io_lulc_search.get_items()]
data = stackstac.stack(io_lulc_items, assets=["data"], epsg=3857)

data.shape, np.unique(data.time)

Output: ((4, 1, 185098, 134598), array(['2020-06-01T00:00:00.000000000'], dtype='datetime64[ns]'))
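Until/unless stack supports this directly, a user-side workaround is to group layers by unique datetime and flatten each group. A minimal NumPy-only sketch (`mosaic_by_unique_time` is a hypothetical helper; with xarray you'd likely reach for groupby instead):

```python
import numpy as np

def mosaic_by_unique_time(times, data):
    """Combine layers sharing a datetime into one, keeping the first
    non-NaN value per pixel (hypothetical helper, not stackstac API).

    times: 1-D datetime64 array of length T
    data:  array of shape (T, band, y, x), NaN where an item has no data
    """
    unique, inverse = np.unique(times, return_inverse=True)
    out = np.full((len(unique),) + data.shape[1:], np.nan, dtype=data.dtype)
    for i, group in enumerate(inverse):
        # First non-NaN wins within each unique timestamp.
        out[group] = np.where(np.isnan(out[group]), data[i], out[group])
    return unique, out
```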

Use reprojected `geometry` instead of `bbox` to determine extent

When an asset or item doesn't have any proj: fields, we're falling back on the WGS84 bbox field on the Item to determine extent, for the purposes of calculating reasonable bounds for the output array:

# There's no bbox, nor shape and transform. The only info we have is `item.bbox` in lat-lon.
else:
    if item_bbox_proj is None:
        try:
            bbox_lonlat = item["bbox"]
        except KeyError:
            asset_bbox_proj = None
        else:
            # TODO handle error
            asset_bbox_proj = reproject_bounds(
                bbox_lonlat, 4326, out_epsg
            )
            item_bbox_proj = asset_bbox_proj
            # ^ so we can reuse for other assets

Reprojecting a lat-lon bbox to the output CRS is less accurate than reprojecting the geometry, then taking its bounds in the output CRS—you're stacking envelopes on envelopes.

The only question is whether to add a dependency to deal with the GeoJSON, or just write something ourselves. I imagine the overhead of making a GEOS object with shapely/pygeos is a bit high when we just need a list of coordinates to pass to pyproj. We could use geojson.utils.coords, but it's simple enough that we should probably just implement it ourselves.
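The effect is easy to demonstrate with a toy transform standing in for a real reprojection: transforming every vertex of the geometry and then taking bounds gives a tighter extent than transforming only the bbox corners. (Pure-Python illustration; in practice the coordinates would go to pyproj.)

```python
import math

def transformed_bounds(coords, project):
    """Transform each vertex, then take the bounds in the output space."""
    pts = [project(x, y) for x, y in coords]
    xs, ys = zip(*pts)
    return min(xs), min(ys), max(xs), max(ys)

# Toy "projection": a 45-degree rotation standing in for a CRS transform.
def rotate45(x, y):
    c = math.cos(math.pi / 4)
    return (x * c - y * c, x * c + y * c)

# A diamond-shaped footprint, and the corners of its axis-aligned bbox.
diamond = [(0, 1), (1, 0), (2, 1), (1, 2)]
bbox_corners = [(0, 0), (2, 0), (2, 2), (0, 2)]

tight = transformed_bounds(diamond, rotate45)       # geometry -> bounds
loose = transformed_bounds(bbox_corners, rotate45)  # envelope of an envelope
```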

Error in bounds_from_affine

I think there is an error at the line:

ur_x, ur_y = af * (0, xsize)

during the calculation of bounds from affine. The upper-right point should be calculated with ur_x, ur_y = af * (xsize, 0), not ur_x, ur_y = af * (0, xsize).

So the following lines of code:

ul_x, ul_y = af * (0, 0)
ll_x, ll_y = af * (0, ysize)
lr_x, lr_y = af * (xsize, ysize)
ur_x, ur_y = af * (0, xsize)

should change into:

ul_x, ul_y = af * (0, 0)
ll_x, ll_y = af * (0, ysize)
lr_x, lr_y = af * (xsize, ysize)
ur_x, ur_y = af * (xsize, 0)

Am I right?
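For what it's worth, a quick worked check with a Sentinel-2-style transform (values illustrative) confirms this: af * (0, xsize) lands at the lower-left corner for a square raster, not the upper-right.

```python
def apply_affine(af, col, row):
    """Apply a GDAL-style affine (a, b, c, d, e, f): (col, row) -> (x, y)."""
    a, b, c, d, e, f = af
    return a * col + b * row + c, d * col + e * row + f

# 10 m pixels, origin at the upper-left corner (illustrative values).
af = (10.0, 0.0, 399960.0, 0.0, -10.0, 4000020.0)
xsize, ysize = 10980, 10980

ul = apply_affine(af, 0, 0)            # upper-left: (399960, 4000020)
ur_right = apply_affine(af, xsize, 0)  # correct upper-right
ur_wrong = apply_affine(af, 0, xsize)  # the buggy version: the lower-left
```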

Missing file handling

While looking at data in various areas, have come across missing bands, etc, yesterday this while testing stackstac:

File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/stackstac/rio_reader.py", line 393, in read
    reader = self.dataset
File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/stackstac/rio_reader.py", line 389, in dataset
    self._dataset = self._open()
File "/home/ubuntu/anaconda3/envs/richard/lib/python3.8/site-packages/stackstac/rio_reader.py", line 330, in _open
    ds = SelfCleaningDatasetReader(rio.parse_path(self.url), sharing=False)
File "rasterio/_base.pyx", line 262, in rasterio._base.DatasetBase.__init__
rasterio.errors.RasterioIOError: HTTP response code: 404

So perhaps we could wrap this and report which asset it failed on, so that item can be removed and the stack retried, and also surface whether it's a bad URL or data missing on the provider's end?
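A sketch of what that wrapping could look like; `open_with_context` and `opener` are hypothetical stand-ins (with `opener` playing the role of rasterio.open), not stackstac's actual reader code:

```python
def open_with_context(url, opener):
    """Re-raise open errors tagged with the asset URL, so a failing or
    missing asset can be identified and dropped from the stack."""
    try:
        return opener(url)
    except Exception as e:
        raise RuntimeError(
            f"Error opening {url!r}: {e}. "
            "The asset may be missing or the URL invalid; "
            "consider removing this item and retrying."
        ) from e
```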

500 Internal Server Error when requesting tiles from within docker container

This is a separate issue to discuss the separate problem I was running into last week (partially discussed in #96 and #97). I am running stackstac.show from within a JupyterLab notebook running in a docker container.

When running show, I can see the checkerboard pattern, but not the data itself. Looking at developer tools, I can see that the tiles for the data itself are returning a 500 error:


Unfortunately, the content of the response is just:

500 Internal Server Error
Server got itself in trouble

Do you have any suggestions as to how we could debug this further, and get more information out of the server?

Time coordinates are sometimes integers, not datetime64

Noticed this in https://gist.github.com/rmg55/b144cb273d9ccfdf979e9843fdf5e651, and I've had it happen before myself:

Coordinates:
  * time            (time) object 1594404158627000000 ... 1614276155393000000

Pretty sure stackstac is correctly making it into a pandas DatetimeIndex:

"time": pd.to_datetime(
    [item["properties"]["datetime"] for item in items],
    infer_datetime_format=True,
    errors="coerce",
),

but something is going weird when xarray receives that, and it reinterprets it as an object array.
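As a user-side workaround until the root cause is fixed, the integer timestamps can be reinterpreted as nanoseconds since the epoch:

```python
import numpy as np

# Integer nanosecond timestamps, as they show up in the object-dtype coordinate.
raw = np.array([1594404158627000000, 1614276155393000000], dtype=object)

# Reinterpret them as nanoseconds since the epoch to recover datetimes.
fixed = raw.astype("int64").astype("datetime64[ns]")
```

With xarray, that would look roughly like `da.assign_coords(time=da.time.astype("int64").astype("datetime64[ns]"))`.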

Support chunking multiple assets together in the `time`/`band` dimensions

Currently, stackstac is built around each STAC Asset being its own chunk in the dask array—the time and band dimensions always have a chunksize of 1.

However, there are cases where you might want to load multiple Assets in one chunk of the array. Most commonly, you'd do this when you have a huge graph, need to cut down on tasks, and can give up some granularity. Particularly, you might be happy to combine the time dimension into fewer chunks if you know you're doing a composite right away anyway. See microsoft/PlanetaryComputer#12 (comment) for a motivating example.

So let's support extending the chunksize= argument to stackstac.stack to take up to 4-tuples (time, band, y, x), so you can specify the chunking along all dimensions.

Note that this isn't #66 (though that could be a follow-on): we're not talking about flattening/pre-mosaicing the data. We'd still load every asset as usual, it's just that the chunks of the dask array might be (4, 2, Y, X) instead of always (1, 1, Y, X).

This should be done/considered as a part of #105.
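For illustration, here's a toy version of how a 4-tuple chunksize= could expand into per-dimension chunk layouts; dask.array's normalize_chunks already does this for real, and `normalize_chunksize` here is just a hypothetical sketch:

```python
def normalize_chunksize(chunksize, shape):
    """Expand a per-dimension chunk size into explicit chunk layouts,
    with a short final chunk where the size doesn't divide evenly.
    (Toy subset of dask.array's normalize_chunks.)"""
    return tuple(
        (size,) * (dim // size) + ((dim % size,) if dim % size else ())
        for size, dim in zip(chunksize, shape)
    )

# e.g. the Sentinel-2 stack from the README, chunked 4 items / 2 bands at a time
chunks = normalize_chunksize((4, 2, 1024, 1024), (13, 17, 10980, 10980))
```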

Questions:

  • When a chunk contains multiple assets, should they be loaded serially, or in parallel? We could create our own internal threadpool, since most of the IO is not CPU-bound. However, because we have to duplicate the GDAL Dataset and file-descriptor per-thread, that might be expensive on memory. I suppose the runtime of T threads reading N assets is the same as T threads reading N / C assets, where each read takes C times longer. So probably in serial. Sure would be nice to just have an aiocogeo Reader for this 😁
  • How will combining multiple bands into a single chunk interplay with #62?

Store per-item bounding boxes as a coordinate

If you stack a bunch of items into an xarray (say, 10,000 scenes covering all of CONUS), then spatially slice out just a tiny area, your time dimension will still contain all 10k items, even though the majority of those items probably don't intersect your AOI and will be all NaNs when loaded. Besides bloated Dask graphs, this isn't that big of a performance deal (we have fastpath logic for this case), but it's annoying from a UX perspective: it would be nice to know how many actual scenes you have without computing anything.

If we store the items' bounding boxes as a coordinate variable (along the time dim), we could then easily drop any items that don't overlap. There could be a convenience function to do the spatial indexing for you, and/or a stackstac.drop_non_overlapping function, which looks at the current bounds of a DataArray (based on the min/max of its x and y dims) and drops items that fall outside them.

The only annoying thing is we'll need to force NumPy to make a 1d array of 4-tuples (object dtype), since xarray won't allow us to have a coordinate variable with extraneous dimensions. Or maybe a record array could work?
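The object-dtype trick is straightforward, if a bit ugly; a minimal sketch of building the coordinate and the overlap test (`overlaps` is a hypothetical helper):

```python
import numpy as np

# Force a 1-D object array of 4-tuples. Assigning element-by-element avoids
# NumPy trying to broadcast the tuples into a (n, 4) 2-D array.
bboxes = [(0, 0, 2, 2), (10, 10, 12, 12), (1, 1, 3, 3)]
bbox_coord = np.empty(len(bboxes), dtype=object)
for i, b in enumerate(bboxes):
    bbox_coord[i] = b

def overlaps(bbox, aoi):
    """Axis-aligned bounding-box intersection test (west, south, east, north)."""
    w, s, e, n = bbox
    aw, asouth, ae, anorth = aoi
    return w < ae and e > aw and s < anorth and n > asouth

# Mask of items intersecting the current x/y bounds of the sliced array.
aoi = (0.5, 0.5, 2.5, 2.5)
keep = np.array([overlaps(b, aoi) for b in bbox_coord])
```

In xarray, `keep` could then drive something like `da.sel(time=keep)` to drop the all-NaN items.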

Run tests in CI

There are some tests now, but they're just not hooked up to GitHub Actions yet.

Make sure to set env vars so the HYPOTHESIS_PROFILE is selected correctly:

settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "default"))

The main task will probably be getting the environment set up correctly with Poetry.

stackstac.show gives empty map

I'm running into a problem with stackstac.show. I've got a simple notebook that grabs some data from Planetary Computer, mosaics it, and then shows it on a map using stackstac.show. This works fine when I run it on Microsoft Planetary Computer, but when I run it locally, I don't get any data (or any checkerboard pattern) on the map - I just get the background OpenStreetMap data.

The key parts of the code are below, and the full ipynb file is available here.

search = stac_client.search(
        collections=["cop-dem-glo-30"],
        bbox=bounds)

items = list(search.get_items())

data = stackstac.stack([item.to_dict() for item in items], bounds_latlon=bounds)

mosaic = data.max(dim='time').squeeze()

mosaic = mosaic.persist()

stackstac.show(mosaic, checkerboard=True)

Am I doing something wrong here when running locally, or have I discovered a bug? When running the show call I can see various Dask jobs being run when looking at the Dask dashboard - which suggests that something is happening - but the result never seems to be displayed.

In case it is relevant, I'm running on Windows 10, with the following in my conda environment:

# packages in environment at C:\Users\rwilson3\Documents\mambaforge\envs\anglo:
#
# Name                    Version                   Build  Channel
abseil-cpp                20210324.2           h0e60522_0    conda-forge
affine                    2.3.0                      py_0    conda-forge
aiohttp                   3.8.1                    pypi_0    pypi
aiosignal                 1.2.0                    pypi_0    pypi
alabaster                 0.7.12                     py_0    conda-forge
alembic                   1.7.3                    pypi_0    pypi
altair                    4.1.0                    pypi_0    pypi
anyio                     3.3.2                    pypi_0    pypi
appdirs                   1.4.4              pyh9f0ad1d_0    conda-forge
argon2-cffi               20.1.0           py39hb82d6ee_2    conda-forge
arrow-cpp                 5.0.0           py39he0f88eb_8_cpu    conda-forge
asciitree                 0.3.3                      py_2    conda-forge
asgiref                   3.4.1                    pypi_0    pypi
astor                     0.8.1                    pypi_0    pypi
async-timeout             4.0.1                    pypi_0    pypi
async_generator           1.10                       py_0    conda-forge
asyncio                   3.4.3                    pypi_0    pypi
asyncpg                   0.22.0                   pypi_0    pypi
atomicwrites              1.4.0                    pypi_0    pypi
attrs                     21.2.0             pyhd8ed1ab_0    conda-forge
aws-c-cal                 0.5.11               he19cf47_0    conda-forge
aws-c-common              0.6.2                h8ffe710_0    conda-forge
aws-c-event-stream        0.2.7               h70e1b0c_13    conda-forge
aws-c-io                  0.10.5               h2fe331c_0    conda-forge
aws-checksums             0.1.11               h1e232aa_7    conda-forge
aws-sdk-cpp               1.8.186              hb0612c5_3    conda-forge
azure-core                1.20.1                   pypi_0    pypi
azure-storage-blob        12.9.0                   pypi_0    pypi
babel                     2.9.1              pyh44b312d_0    conda-forge
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                        py_2    conda-forge
backports-entry-points-selectable 1.1.0                    pypi_0    pypi
backports.functools_lru_cache 1.6.4              pyhd8ed1ab_0    conda-forge
base58                    2.1.0                    pypi_0    pypi
basemap                   1.2.2            py39h381b4b0_3    conda-forge
beautifulsoup4            4.10.0                   pypi_0    pypi
black                     21.9b0                   pypi_0    pypi
blas                      2.111                       mkl    conda-forge
blas-devel                3.9.0              11_win64_mkl    conda-forge
bleach                    4.1.0              pyhd8ed1ab_0    conda-forge
blinker                   1.4                      pypi_0    pypi
blosc                     1.21.0               h0e60522_0    conda-forge
bokeh                     2.4.0            py39hcbf5309_0    conda-forge
boost-cpp                 1.74.0               h5b4e17d_4    conda-forge
boto3                     1.18.53                  pypi_0    pypi
botocore                  1.21.53                  pypi_0    pypi
braceexpand               0.1.7                    pypi_0    pypi
branca                    0.4.2              pyhd8ed1ab_0    conda-forge
brotli                    1.0.9                    pypi_0    pypi
brotli-asgi               1.1.0                    pypi_0    pypi
brotli-bin                1.0.9                h8ffe710_5    conda-forge
brotlipy                  0.7.0           py39hb82d6ee_1001    conda-forge
bs4                       0.0.1                    pypi_0    pypi
buildpg                   0.3                      pypi_0    pypi
bzip2                     1.0.8                h8ffe710_4    conda-forge
c-ares                    1.17.2               h8ffe710_0    conda-forge
ca-certificates           2021.10.8            h5b45459_0    conda-forge
cachetools                4.2.4              pyhd8ed1ab_0    conda-forge
cachey                    0.2.1              pyh9f0ad1d_0    conda-forge
cairo                     1.16.0            hb19e0ff_1008    conda-forge
cartopy                   0.20.0           py39h381b4b0_0    conda-forge
certifi                   2021.10.8        py39hcbf5309_1    conda-forge
cffi                      1.14.6           py39h0878f49_1    conda-forge
cfgv                      3.3.1                    pypi_0    pypi
cfitsio                   3.470                h0af3d06_7    conda-forge
cftime                    1.5.1            py39h5d4886f_0    conda-forge
chardet                   4.0.0            py39hcbf5309_1    conda-forge
charls                    2.2.0                h39d44d4_0    conda-forge
charset-normalizer        2.0.6                    pypi_0    pypi
click                     7.1.2              pyh9f0ad1d_0    conda-forge
click-plugins             1.1.1                      py_0    conda-forge
cligj                     0.6.0              pyh9f0ad1d_0    conda-forge
cloudpickle               2.0.0              pyhd8ed1ab_0    conda-forge
cogeo-mosaic              3.0.2                    pypi_0    pypi
colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
colorcet                  2.0.6              pyhd8ed1ab_0    conda-forge
colorlog                  6.6.0                    pypi_0    pypi
coverage                  6.0.2                    pypi_0    pypi
cramjam                   2.4.0                    pypi_0    pypi
cryptography              3.4.7            py39hd8d06c1_0    conda-forge
cudatoolkit               10.2.89              hb195166_9    conda-forge
curl                      7.79.1               h789b8ee_1    conda-forge
cycler                    0.10.0                     py_2    conda-forge
cython                    0.29.24                  pypi_0    pypi
cytoolz                   0.11.0           py39hb82d6ee_3    conda-forge
dask                      2021.9.1           pyhd8ed1ab_0    conda-forge
dask-core                 2021.9.1           pyhd8ed1ab_0    conda-forge
dask-image                0.6.0                    pypi_0    pypi
datacube                  1.8.6              pyhd8ed1ab_0    conda-forge
datashader                0.13.0             pyh6c4a22f_0    conda-forge
datashape                 0.5.4                      py_1    conda-forge
debugpy                   1.4.1            py39h415ef7b_0    conda-forge
decorator                 5.1.0              pyhd8ed1ab_0    conda-forge
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
distlib                   0.3.3                    pypi_0    pypi
distributed               2021.9.1         py39hcbf5309_0    conda-forge
docstring-parser          0.7.3                    pypi_0    pypi
docstring_parser          0.12               pyhd8ed1ab_0    conda-forge
docutils                  0.17.1           py39hcbf5309_0    conda-forge
easyprocess               0.3                      pypi_0    pypi
entrypoint2               0.2.4                    pypi_0    pypi
entrypoints               0.3             py39hde42818_1002    conda-forge
expat                     2.4.1                h39d44d4_0    conda-forge
falcon                    2.0.0                    pypi_0    pypi
fastapi                   0.67.0                   pypi_0    pypi
fastapi-utils             0.2.1                    pypi_0    pypi
fasteners                 0.16               pyhd8ed1ab_0    conda-forge
filelock                  3.3.0                    pypi_0    pypi
fiona                     1.8.20           py39hea8b339_1    conda-forge
folium                    0.12.1                   pypi_0    pypi
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 hab24e00_0    conda-forge
fontconfig                2.13.1            h1989441_1005    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
freetype                  2.10.4               h546665d_1    conda-forge
freetype-py               2.2.0              pyh9f0ad1d_0    conda-forge
freexl                    1.0.6                ha8e266a_0    conda-forge
frozenlist                1.2.0                    pypi_0    pypi
fsspec                    2021.10.0          pyhd8ed1ab_0    conda-forge
gdal                      3.3.2            py39h7c9a9b1_2    conda-forge
geoalchemy2               0.7.0                    pypi_0    pypi
geojson-pydantic          0.3.1                    pypi_0    pypi
geopandas                 0.10.0             pyhd8ed1ab_0    conda-forge
geopandas-base            0.10.0             pyha770c72_0    conda-forge
geos                      3.9.1                h39d44d4_2    conda-forge
geotiff                   1.7.0                ha8a8a2d_0    conda-forge
gettext                   0.19.8.1          ha2e2712_1008    conda-forge
gflags                    2.2.2             ha925a31_1004    conda-forge
ghp-import                2.0.2                    pypi_0    pypi
giflib                    5.2.1                h8d14728_2    conda-forge
gitdb                     4.0.7                    pypi_0    pypi
gitpython                 3.1.24                   pypi_0    pypi
glog                      0.5.0                h4797de2_0    conda-forge
greenlet                  1.1.2            py39h415ef7b_0    conda-forge
grpc-cpp                  1.40.0               h2431d41_2    conda-forge
h11                       0.12.0                   pypi_0    pypi
hdf4                      4.2.15               h0e5069d_3    conda-forge
hdf5                      1.12.1          nompi_h2a0e4a3_100    conda-forge
heapdict                  1.0.1                      py_0    conda-forge
holoviews                 1.14.6             pyhd8ed1ab_0    conda-forge
hsluv                     5.0.2              pyh44b312d_0    conda-forge
httpcore                  0.13.7                   pypi_0    pypi
httpx                     0.20.0                   pypi_0    pypi
hug                       2.6.1                    pypi_0    pypi
hvplot                    0.7.3              pyh6c4a22f_0    conda-forge
icu                       68.1                 h0e60522_0    conda-forge
identify                  2.3.0                    pypi_0    pypi
idna                      3.2                      pypi_0    pypi
imagecodecs               2021.6.8         py39h166567b_0    conda-forge
imageio                   2.9.0                      py_0    conda-forge
imagesize                 1.2.0                      py_0    conda-forge
importlib-metadata        4.8.1            py39hcbf5309_0    conda-forge
importlib_metadata        4.8.1                hd8ed1ab_0    conda-forge
iniconfig                 1.1.1                    pypi_0    pypi
intel-openmp              2021.3.0          h57928b3_3372    conda-forge
ipykernel                 6.4.1            py39h832f523_0    conda-forge
ipyleaflet                0.13.6                   pypi_0    pypi
ipyspin                   0.1.5                    pypi_0    pypi
ipython                   7.28.0           py39h832f523_0    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
ipyurl                    0.1.2                    pypi_0    pypi
ipywidgets                7.6.5              pyhd8ed1ab_0    conda-forge
iso-639                   0.4.5                    pypi_0    pypi
iso3166                   2.0.2                    pypi_0    pypi
isodate                   0.6.0                    pypi_0    pypi
jbig                      2.1               h8d14728_2003    conda-forge
jedi                      0.18.0           py39hcbf5309_2    conda-forge
jinja2                    2.11.3                   pypi_0    pypi
jmespath                  0.10.0             pyh9f0ad1d_0    conda-forge
joblib                    1.0.1              pyhd8ed1ab_0    conda-forge
jpeg                      9d                   h8ffe710_0    conda-forge
json5                     0.9.6                    pypi_0    pypi
jsonschema                4.0.1              pyhd8ed1ab_0    conda-forge
jupyter                   1.0.0            py39hcbf5309_6    conda-forge
jupyter-server            1.11.1                   pypi_0    pypi
jupyter_client            6.1.12             pyhd8ed1ab_0    conda-forge
jupyter_console           6.4.0              pyhd8ed1ab_1    conda-forge
jupyter_core              4.8.1            py39hcbf5309_0    conda-forge
jupyterlab                3.1.17                   pypi_0    pypi
jupyterlab-server         2.8.2                    pypi_0    pypi
jupyterlab_pygments       0.1.2              pyh9f0ad1d_0    conda-forge
jupyterlab_widgets        1.0.2              pyhd8ed1ab_0    conda-forge
jxrlib                    1.1                  h8ffe710_2    conda-forge
kealib                    1.4.14               h8995ca9_3    conda-forge
kiwisolver                1.3.2            py39h2e07f2f_0    conda-forge
krb5                      1.19.2               hbae68bd_2    conda-forge
lark-parser               0.12.0             pyhd8ed1ab_0    conda-forge
lcms2                     2.12                 h2a16943_0    conda-forge
lerc                      2.2.1                h0e60522_0    conda-forge
libaec                    1.0.6                h39d44d4_0    conda-forge
libblas                   3.9.0              11_win64_mkl    conda-forge
libbrotlicommon           1.0.9                h8ffe710_5    conda-forge
libbrotlidec              1.0.9                h8ffe710_5    conda-forge
libbrotlienc              1.0.9                h8ffe710_5    conda-forge
libcblas                  3.9.0              11_win64_mkl    conda-forge
libclang                  11.1.0          default_h5c34c98_1    conda-forge
libcurl                   7.79.1               h789b8ee_1    conda-forge
libdeflate                1.7                  h8ffe710_5    conda-forge
libffi                    3.4.2                h0e60522_4    conda-forge
libgdal                   3.3.2                hfb14b67_2    conda-forge
libglib                   2.68.4               h3be07f2_1    conda-forge
libiconv                  1.16                 he774522_0    conda-forge
libkml                    1.3.0             h9859afa_1014    conda-forge
liblapack                 3.9.0              11_win64_mkl    conda-forge
liblapacke                3.9.0              11_win64_mkl    conda-forge
libnetcdf                 4.8.1           nompi_h1cc8e9d_101    conda-forge
libpng                    1.6.37               h1d00b33_2    conda-forge
libpq                     13.3                 hfcc5ef8_0    conda-forge
libprotobuf               3.18.1               h7755175_0    conda-forge
librttopo                 1.1.0                hb340de5_6    conda-forge
libsodium                 1.0.18               h8d14728_1    conda-forge
libspatialindex           1.9.3                h39d44d4_4    conda-forge
libspatialite             5.0.1                h762a7f4_6    conda-forge
libssh2                   1.10.0               h680486a_2    conda-forge
libthrift                 0.15.0               h636ae23_1    conda-forge
libtiff                   4.3.0                h0c97f57_1    conda-forge
libutf8proc               2.6.1                hcb41399_0    conda-forge
libuv                     1.42.0               h8ffe710_0    conda-forge
libwebp-base              1.2.1                h8ffe710_0    conda-forge
libxml2                   2.9.12               hf5bbc77_0    conda-forge
libzip                    1.8.0                hfed4ece_1    conda-forge
libzlib                   1.2.11            h8ffe710_1013    conda-forge
libzopfli                 1.0.3                h0e60522_0    conda-forge
llvmlite                  0.37.0           py39ha0cd8c8_0    conda-forge
locket                    0.2.0                      py_2    conda-forge
loguru                    0.5.3                    pypi_0    pypi
lxml                      4.6.4                    pypi_0    pypi
lz4-c                     1.9.3                h8ffe710_1    conda-forge
m2w64-gcc-libgfortran     5.3.0                         6    conda-forge
m2w64-gcc-libs            5.3.0                         7    conda-forge
m2w64-gcc-libs-core       5.3.0                         7    conda-forge
m2w64-gmp                 6.1.0                         2    conda-forge
m2w64-libwinpthread-git   5.0.0.4634.697f757               2    conda-forge
magicgui                  0.3.2              pyhd8ed1ab_0    conda-forge
mako                      1.1.5                    pypi_0    pypi
mapclassify               2.4.3              pyhd8ed1ab_0    conda-forge
markdown                  3.3.4              pyhd8ed1ab_0    conda-forge
markupsafe                2.0.1            py39hb82d6ee_0    conda-forge
matplotlib-base           3.4.3            py39h581301d_1    conda-forge
matplotlib-inline         0.1.3              pyhd8ed1ab_0    conda-forge
mercantile                1.2.1                    pypi_0    pypi
mergedeep                 1.3.4                    pypi_0    pypi
mistune                   0.8.4           py39hb82d6ee_1004    conda-forge
mkdocs                    1.2.2                    pypi_0    pypi
mkdocs-material           7.3.2                    pypi_0    pypi
mkdocs-material-extensions 1.0.3                    pypi_0    pypi
mkl                       2021.3.0           hb70f87d_564    conda-forge
mkl-devel                 2021.3.0           h57928b3_565    conda-forge
mkl-include               2021.3.0           hb70f87d_564    conda-forge
monotonic                 1.5                        py_0    conda-forge
morecantile               2.1.4                    pypi_0    pypi
msgpack-python            1.0.2            py39h2e07f2f_1    conda-forge
msrest                    0.6.21                   pypi_0    pypi
mss                       6.1.0                    pypi_0    pypi
msys2-conda-epoch         20160418                      1    conda-forge
multidict                 5.2.0                    pypi_0    pypi
multipledispatch          0.6.0                      py_0    conda-forge
munch                     2.5.0                      py_0    conda-forge
mutagen                   1.45.1                   pypi_0    pypi
mypy-extensions           0.4.3                    pypi_0    pypi
napari                    0.4.12             pyhd8ed1ab_0    conda-forge
napari-console            0.0.4              pyhd8ed1ab_0    conda-forge
napari-plugin-engine      0.2.0            py39hcbf5309_0    conda-forge
napari-svg                0.1.5              pyhd8ed1ab_0    conda-forge
nbclassic                 0.3.2                    pypi_0    pypi
nbclient                  0.5.4              pyhd8ed1ab_0    conda-forge
nbconvert                 6.2.0            py39hcbf5309_0    conda-forge
nbformat                  5.1.3              pyhd8ed1ab_0    conda-forge
nest-asyncio              1.5.1              pyhd8ed1ab_0    conda-forge
netcdf4                   1.5.7           nompi_py39hf113b1f_103    conda-forge
networkx                  2.5                        py_0    conda-forge
nodeenv                   1.6.0                    pypi_0    pypi
notebook                  6.4.4              pyha770c72_0    conda-forge
numba                     0.54.0           py39hb8cd55e_0    conda-forge
numcodecs                 0.9.1            py39h415ef7b_1    conda-forge
numexpr                   2.7.3                    pypi_0    pypi
numpy                     1.20.0           py39h6635163_0    conda-forge
numpydoc                  1.1.0                      py_1    conda-forge
oauthlib                  3.1.1                    pypi_0    pypi
odc-algo                  0.2.0a4                  pypi_0    pypi
odc-io                    0.2.0a1                  pypi_0    pypi
odc-stac                  0.2.0a8                  pypi_0    pypi
olefile                   0.46               pyh9f0ad1d_1    conda-forge
opencv-python             4.5.4.58                 pypi_0    pypi
openjpeg                  2.4.0                hb211442_1    conda-forge
openssl                   1.1.1l               h8ffe710_0    conda-forge
orjson                    3.6.4                    pypi_0    pypi
packaging                 21.0               pyhd8ed1ab_0    conda-forge
pafy                      0.5.5                    pypi_0    pypi
pandas                    1.2.5                    pypi_0    pypi
pandoc                    2.14.2               h8ffe710_0    conda-forge
pandocfilters             1.5.0              pyhd8ed1ab_0    conda-forge
panel                     0.12.4             pyhd8ed1ab_0    conda-forge
param                     1.11.1             pyh6c4a22f_0    conda-forge
parquet-cpp               1.5.1                         1    conda-forge
parso                     0.8.2              pyhd8ed1ab_0    conda-forge
partd                     1.2.0              pyhd8ed1ab_0    conda-forge
pathspec                  0.9.0                    pypi_0    pypi
pcre                      8.45                 h0e60522_0    conda-forge
pdocs                     1.1.1                    pypi_0    pypi
pickleshare               0.7.5           py39hde42818_1002    conda-forge
pillow                    8.3.2            py39h916092e_0    conda-forge
pims                      0.5                      pypi_0    pypi
pint                      0.18               pyhd8ed1ab_0    conda-forge
pip                       21.2.4             pyhd8ed1ab_0    conda-forge
pixman                    0.40.0               h8ffe710_0    conda-forge
platformdirs              2.4.0                    pypi_0    pypi
plotext                   2.3.1                    pypi_0    pypi
pluggy                    1.0.0                    pypi_0    pypi
pooch                     1.5.2              pyhd8ed1ab_0    conda-forge
poppler                   21.09.0              h24fffdf_3    conda-forge
poppler-data              0.4.11               hd8ed1ab_0    conda-forge
postgresql                13.3                 h1c22c4f_0    conda-forge
pre-commit                2.15.0                   pypi_0    pypi
proj                      8.0.1                h1cfcee9_0    conda-forge
prometheus_client         0.11.0             pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.20             pyha770c72_0    conda-forge
prompt_toolkit            3.0.20               hd8ed1ab_0    conda-forge
protobuf                  3.18.1                   pypi_0    pypi
psutil                    5.8.0            py39hb82d6ee_1    conda-forge
psycopg2                  2.9.1            py39h0878f49_0    conda-forge
psycopg2-binary           2.9.1                    pypi_0    pypi
psygnal                   0.1.4            py39h2e07f2f_0    conda-forge
py                        1.10.0                   pypi_0    pypi
pyarrow                   5.0.0           py39hf9247be_8_cpu    conda-forge
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pycryptodome              3.11.0                   pypi_0    pypi
pycryptodomex             3.11.0                   pypi_0    pypi
pyct                      0.4.6                      py_0    conda-forge
pyct-core                 0.4.6                      py_0    conda-forge
pydantic                  1.8.2            py39hb82d6ee_0    conda-forge
pydeck                    0.7.0                    pypi_0    pypi
pyee                      8.2.2                    pypi_0    pypi
pygeos                    0.10.2                   pypi_0    pypi
pygments                  2.10.0             pyhd8ed1ab_0    conda-forge
pymdown-extensions        9.0                      pypi_0    pypi
pyopengl                  3.1.5                      py_0    conda-forge
pyopenssl                 21.0.0             pyhd8ed1ab_0    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pypgstac                  0.3.4                    pypi_0    pypi
pyppeteer                 0.2.6                    pypi_0    pypi
pyproj                    3.2.1            py39ha996c60_2    conda-forge
pyqt                      5.12.3           py39hcbf5309_7    conda-forge
pyqt-impl                 5.12.3           py39h415ef7b_7    conda-forge
pyqt5-sip                 4.19.18          py39h415ef7b_7    conda-forge
pyqtchart                 5.12             py39h415ef7b_7    conda-forge
pyqtwebengine             5.12.1           py39h415ef7b_7    conda-forge
pyrsistent                0.17.3           py39hb82d6ee_2    conda-forge
pyscreenshot              3.0                      pypi_0    pypi
pyshp                     2.1.3              pyh44b312d_0    conda-forge
pysocks                   1.7.1            py39hcbf5309_3    conda-forge
pystac                    1.2.0                    pypi_0    pypi
pystac-client             0.3.0                    pypi_0    pypi
pytest                    6.2.5                    pypi_0    pypi
pytest-asyncio            0.16.0                   pypi_0    pypi
pytest-cov                3.0.0                    pypi_0    pypi
python                    3.9.7           h7840368_3_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python-dotenv             0.19.0                   pypi_0    pypi
python-snappy             0.6.0            py39h1d87f24_0    conda-forge
python_abi                3.9                      2_cp39    conda-forge
pytorch                   1.10.0          py3.9_cuda10.2_cudnn7_0    pytorch
pytorch-mutex             1.0                        cuda    pytorch
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
pytz-deprecation-shim     0.1.0.post0              pypi_0    pypi
pyviz_comms               2.1.0              pyhd8ed1ab_0    conda-forge
pywavelets                1.1.1            py39h5d4886f_3    conda-forge
pywin32                   301              py39hb82d6ee_0    conda-forge
pywinpty                  1.1.4            py39h99910a6_0    conda-forge
pyyaml                    5.4.1            py39hb82d6ee_1    conda-forge
pyyaml-env-tag            0.1                      pypi_0    pypi
pyzmq                     22.3.0           py39he46f08e_0    conda-forge
qt                        5.12.9               h5909a2a_4    conda-forge
qtconsole                 5.1.1              pyhd8ed1ab_0    conda-forge
qtpy                      1.11.2             pyhd8ed1ab_0    conda-forge
rasterio                  1.2.8            py39h85efae1_0    conda-forge
re2                       2021.09.01           h0e60522_0    conda-forge
regex                     2021.9.30                pypi_0    pypi
requests                  2.26.0             pyhd8ed1ab_0    conda-forge
requests-oauthlib         1.3.0                    pypi_0    pypi
requests-unixsocket       0.2.0                    pypi_0    pypi
retrying                  1.3.3                      py_2    conda-forge
rfc3986                   1.5.0                    pypi_0    pypi
rio-cogeo                 2.3.1                    pypi_0    pypi
rio-color                 1.0.4                    pypi_0    pypi
rio-mucho                 1.0.0                    pypi_0    pypi
rio-stac                  0.3.1                    pypi_0    pypi
rio-tiler                 2.1.3                    pypi_0    pypi
rio-tiler-pds             0.5.2                    pypi_0    pypi
rio-toa                   0.3.0                    pypi_0    pypi
rio-viz                   0.7.2                    pypi_0    pypi
rioxarray                 0.7.1              pyhd8ed1ab_0    conda-forge
rtree                     0.9.7            py39h09fdee3_2    conda-forge
s3transfer                0.5.0              pyhd8ed1ab_0    conda-forge
scikit-image              0.18.3                   pypi_0    pypi
scikit-learn              1.0              py39h74df8f2_1    conda-forge
scipy                     1.7.1            py39hc0c34ad_0    conda-forge
send2trash                1.8.0              pyhd8ed1ab_0    conda-forge
setuptools                58.0.4           py39hcbf5309_2    conda-forge
shapely                   1.7.1            py39haadaec5_5    conda-forge
simplejpeg                1.6.2                    pypi_0    pypi
simplejson                3.17.5                   pypi_0    pypi
six                       1.16.0             pyh6c4a22f_0    conda-forge
slicerator                1.0.0                    pypi_0    pypi
smart-open                4.2.0                    pypi_0    pypi
smmap                     4.0.0                    pypi_0    pypi
snappy                    1.1.8                ha925a31_3    conda-forge
sniffio                   1.2.0                    pypi_0    pypi
snowballstemmer           2.1.0              pyhd8ed1ab_0    conda-forge
snuggs                    1.4.7                      py_0    conda-forge
sortedcontainers          2.4.0              pyhd8ed1ab_0    conda-forge
soupsieve                 2.2.1                    pypi_0    pypi
spatialpandas             0.4.3              pyhd8ed1ab_0    conda-forge
sphinx                    4.2.0              pyh6c4a22f_0    conda-forge
sphinxcontrib-applehelp   1.0.2                      py_0    conda-forge
sphinxcontrib-devhelp     1.0.2                      py_0    conda-forge
sphinxcontrib-htmlhelp    2.0.0              pyhd8ed1ab_0    conda-forge
sphinxcontrib-jsmath      1.0.1                      py_0    conda-forge
sphinxcontrib-qthelp      1.0.3                      py_0    conda-forge
sphinxcontrib-serializinghtml 1.1.5              pyhd8ed1ab_0    conda-forge
sqlakeyset                1.0.1629029818           pypi_0    pypi
sqlalchemy                1.3.23                   pypi_0    pypi
sqlite                    3.36.0               h8ffe710_2    conda-forge
stac-fastapi-api          2.1.1                     dev_0    <develop>
stac-fastapi-extensions   2.1.1                     dev_0    <develop>
stac-fastapi-pgstac       2.1.1                     dev_0    <develop>
stac-fastapi-sqlalchemy   2.1.1                     dev_0    <develop>
stac-fastapi-types        2.1.1                     dev_0    <develop>
stac-nb                   0.4.0                    pypi_0    pypi
stac-pydantic             2.0.1                    pypi_0    pypi
stackstac                 0.2.1              pyhd8ed1ab_0    conda-forge
stacterm                  0.1.0                    pypi_0    pypi
starlette                 0.14.2                   pypi_0    pypi
starlette-cramjam         0.1.0                    pypi_0    pypi
streamlink                2.4.0                    pypi_0    pypi
streamlit                 1.0.0                    pypi_0    pypi
streamlit-folium          0.4.0                    pypi_0    pypi
supermercado              0.2.0                    pypi_0    pypi
superqt                   0.2.4              pyhd8ed1ab_0    conda-forge
tbb                       2021.3.0             h2d74725_0    conda-forge
tblib                     1.7.0              pyhd8ed1ab_0    conda-forge
terminado                 0.12.1           py39hcbf5309_0    conda-forge
termtables                0.2.4                    pypi_0    pypi
testpath                  0.5.0              pyhd8ed1ab_0    conda-forge
threadpoolctl             3.0.0              pyh8a188c0_0    conda-forge
tifffile                  2021.8.30                pypi_0    pypi
tiledb                    2.3.4                h78dabda_0    conda-forge
titiler-application       0.3.11                    dev_0    <develop>
titiler-core              0.3.11                    dev_0    <develop>
titiler-mosaic            0.3.11                    dev_0    <develop>
titiler-pgstac            0.1.0a1                   dev_0    <develop>
tk                        8.6.11               h8ffe710_1    conda-forge
toml                      0.10.2                   pypi_0    pypi
tomli                     1.2.1                    pypi_0    pypi
toolz                     0.11.1                     py_0    conda-forge
torchvision               0.11.1               py39_cu102    pytorch
tornado                   6.1              py39hb82d6ee_1    conda-forge
tqdm                      4.62.3             pyhd8ed1ab_0    conda-forge
traitlets                 5.1.0              pyhd8ed1ab_0    conda-forge
traittypes                0.2.1                    pypi_0    pypi
typer                     0.3.2                    pypi_0    pypi
typing-extensions         3.10.0.2             hd8ed1ab_0    conda-forge
typing_extensions         3.10.0.2           pyha770c72_0    conda-forge
tzdata                    2021.2.post0             pypi_0    pypi
tzlocal                   4.0                      pypi_0    pypi
ucrt                      10.0.20348.0         h57928b3_0    conda-forge
urllib3                   1.26.7             pyhd8ed1ab_0    conda-forge
uvicorn                   0.15.0                   pypi_0    pypi
validators                0.18.2                   pypi_0    pypi
vc                        14.2                 hb210afc_5    conda-forge
vidgear                   0.2.3                    pypi_0    pypi
virtualenv                20.8.1                   pypi_0    pypi
vispy                     0.9.2            py39h5d4886f_0    conda-forge
vs2015_runtime            14.29.30037          h902a5da_5    conda-forge
watchdog                  2.1.6                    pypi_0    pypi
wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
webencodings              0.5.1                      py_1    conda-forge
websocket-client          1.2.1                    pypi_0    pypi
websockets                9.1                      pypi_0    pypi
wheel                     0.37.0             pyhd8ed1ab_1    conda-forge
widgetsnbextension        3.5.1            py39hcbf5309_4    conda-forge
win32-setctime            1.0.3                    pypi_0    pypi
win_inet_pton             1.1.0            py39hcbf5309_2    conda-forge
winpty                    0.4.3                         4    conda-forge
wrapt                     1.13.2           py39hb82d6ee_0    conda-forge
xarray                    0.19.0             pyhd8ed1ab_1    conda-forge
xarray-leaflet            0.1.15                   pypi_0    pypi
xarray-spatial            0.2.9              pyhd8ed1ab_0    conda-forge
xerces-c                  3.2.3                h0e60522_2    conda-forge
xyzservices               2021.9.1           pyhd8ed1ab_0    conda-forge
xz                        5.2.5                h62dcd97_1    conda-forge
yaml                      0.2.5                he774522_0    conda-forge
yarl                      1.7.2                    pypi_0    pypi
yt-dlp                    2021.11.10.1             pypi_0    pypi
zarr                      2.10.1             pyhd8ed1ab_0    conda-forge
zeromq                    4.3.4                h0e60522_1    conda-forge
zfp                       0.5.5                h0e60522_7    conda-forge
zict                      2.0.0                      py_0    conda-forge
zipp                      3.6.0              pyhd8ed1ab_0    conda-forge
zlib                      1.2.11            h8ffe710_1013    conda-forge
zstd                      1.5.0                h6255e5f_0    conda-forge

Error when attempting to bind on address ::1

I'm running stackstac from a Jupyter instance running inside a docker container. When calling stackstac.show I get the following error:

/srv/conda/envs/notebook/lib/python3.8/site-packages/stackstac/show.py:484: UserWarning: Calculating 2nd and 98th percentile of the entire array, since no range was given. This could be expensive!
  warnings.warn(
tornado.application - ERROR - Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <zmq.eventloop.ioloop.ZMQIOLoop object at 0x7f6c083fabe0>>, <Task finished name='Task-51' coro=<_launch_server.<locals>.run() done, defined at /srv/conda/envs/notebook/lib/python3.8/site-packages/stackstac/show.py:847> exception=OSError(99, "error while attempting to bind on address ('::1', 8000, 0, 0): cannot assign requested address")>)
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/tornado/ioloop.py", line 741, in _run_callback
    ret = callback()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/tornado/ioloop.py", line 765, in _discard_future_result
    future.result()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/stackstac/show.py", line 858, in run
    await site.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/aiohttp/web_runner.py", line 121, in start
    self._server = await loop.create_server(
  File "/srv/conda/envs/notebook/lib/python3.8/asyncio/base_events.py", line 1463, in create_server
    raise OSError(err.errno, 'error while attempting '
OSError: [Errno 99] error while attempting to bind on address ('::1', 8000, 0, 0): cannot assign requested address

It looks like it is failing to bind to the address ::1 (the IPv6 loopback address, the counterpart of 127.0.0.1). Following the stackstac code, I found https://github.com/gjoseph92/stackstac/blob/main/stackstac/show.py#L857, where the server is started on localhost, which is presumably resolved to ::1 somewhere inside the server hosting code.

I found a few other issues on GitHub that seem very similar, for other applications - e.g. here. That links to an aiohttp issue here, though I can't quite work out whether it has been fixed or not.

Anyway, the solution suggested in those issues seems to be forcing the binding address to be either 0.0.0.0 or 127.0.0.1. Changing localhost to either of these at https://github.com/gjoseph92/stackstac/blob/main/stackstac/show.py#L857 works around the error - though I must admit I can't really understand why localhost doesn't work.

Do you have any deeper understanding of this? If not, would you be happy with me submitting a PR to change to either 0.0.0.0 or 127.0.0.1?
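For what it's worth, the behavior is reproducible with the stdlib alone: `localhost` can resolve to both loopback addresses, and in a container whose network namespace has no IPv6 loopback, the ::1 bind is the one that fails. A minimal sketch (plain `socket` calls, not stackstac code):

```python
import socket

# "localhost" may resolve to ::1 (IPv6) as well as 127.0.0.1 (IPv4); whichever
# address the server library picks is what it tries to bind. In a container
# without an IPv6 loopback, binding ::1 raises OSError 99
# ("cannot assign requested address").
addrs = {info[4][0] for info in socket.getaddrinfo("localhost", 8000, proto=socket.IPPROTO_TCP)}
print(addrs)  # e.g. {'127.0.0.1', '::1'}

# Binding the IPv4 loopback explicitly sidesteps the ambiguity entirely:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
print(s.getsockname()[0])  # 127.0.0.1
s.close()
```

Binding 0.0.0.0 would also work, but it listens on every interface, which is more than a notebook preview server needs.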

stackstac.mosaic gives different results with different chunk sizes

I've got a strange issue with stackstac.mosaic, where I seem to get different results depending on the chunksize parameter that I give to the original stackstac.stack call.

I've stacked together a few tiles of elevation data, some of which overlap.

If I stack with a chunksize of 100, and then run mosaic, I get this output:

image

But if I run the same thing with a chunksize of 1000, then I get this output:

image

As you can see, there are some differences. We can show those more clearly if we subtract one from the other - which gives us this:

image

I'm rather confused by what is going on here: I don't think I'm doing something stupid in my code - but I wouldn't expect a change in the chunk size to change the result.

My full notebook is available here, showing exactly what I'm doing. As you can see at the end of the notebook, doing a max over the time dimension works fine.
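For reference, what I'd expect mosaic to compute is, per pixel, the first non-NaN value along the time axis - a pure per-pixel reduction, which shouldn't depend on chunking at all. A NumPy sketch with made-up data (not stackstac's implementation):

```python
import numpy as np

# Mosaic = per-pixel first non-NaN value along the time axis. Because it's a
# pure per-pixel reduction, the result should be independent of chunk size.
stack = np.array([
    [[np.nan, 1.0],
     [3.0, np.nan]],
    [[5.0, 6.0],
     [7.0, 8.0]],
])  # shape (time=2, y=2, x=2)

first_valid = np.argmax(~np.isnan(stack), axis=0)  # index of first valid layer per pixel
mosaic = np.take_along_axis(stack, first_valid[None], axis=0)[0]
print(mosaic)  # [[5. 1.]
               #  [3. 8.]]
```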

Stackstac can't be used on Python 3.7, but installs successfully

I was able to install StackSTAC (0.1.1) successfully in a fresh poetry environment using Python 3.7.10. However, importing the package fails, due to an incompatibility with the typing module in this Python version.

import stackstac

returns:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jovyan/.cache/pypoetry/virtualenvs/test-Qq_P6NR6-py3.7/lib/python3.7/site-packages/stackstac/__init__.py", line 2, in <module>
    from .rio_reader import DEFAULT_GDAL_ENV, MULTITHREADED_DRIVER_ALLOWLIST
  File "/home/jovyan/.cache/pypoetry/virtualenvs/test-Qq_P6NR6-py3.7/lib/python3.7/site-packages/stackstac/rio_reader.py", line 7, in <module>
    from typing import TYPE_CHECKING, Optional, Protocol, Tuple, Type, Union
ImportError: cannot import name 'Protocol' from 'typing' (/opt/conda/lib/python3.7/typing.py)
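For anyone hitting this before the package metadata is fixed, the usual workaround for the missing import is the conditional fallback to typing_extensions - a generic sketch of the pattern, not necessarily how stackstac resolves it:

```python
import sys

# typing.Protocol exists only on Python >= 3.8; the typing_extensions
# package backports it for older interpreters.
if sys.version_info >= (3, 8):
    from typing import Protocol
else:
    from typing_extensions import Protocol

class Reader(Protocol):
    """Structural type: anything with a matching read() method qualifies."""
    def read(self, window) -> object: ...
```

The cleaner fix is for the package to declare `python_requires=">=3.8"` so pip refuses to install it on 3.7 in the first place.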

Use `data_type` and `nodata` from `raster` extension if present

Currently the default dtype is float64 and the default fill_value is np.nan.

The raster extension defines fields for specifying the dtype (data_type) and fill value (nodata). It might be a good idea to change the defaults to use the values from raster:bands, falling back to float64 / np.nan if the raster extension isn't present.
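A sketch of what that fallback could look like (the helper is hypothetical; the `raster:bands`, `data_type`, and `nodata` field names come from the raster extension spec):

```python
import numpy as np

def dtype_and_fill(asset: dict):
    # Hypothetical helper: prefer the raster extension's data_type / nodata
    # for the first band, falling back to today's float64 / NaN defaults.
    bands = asset.get("raster:bands") or []
    if bands and "data_type" in bands[0]:
        return np.dtype(bands[0]["data_type"]), bands[0].get("nodata", np.nan)
    return np.dtype("float64"), np.nan

print(dtype_and_fill({"raster:bands": [{"data_type": "uint16", "nodata": 0}]}))
# (dtype('uint16'), 0)
print(dtype_and_fill({}))  # (dtype('float64'), nan)
```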

Use item assets from the base collection when available

The item assets STAC extension allows collections to describe which assets might be available, and give some metadata common to all instances of those assets—for example, ["eo:bands"]["common_name"]. When this extension is present, if a field is missing on an individual asset, you're supposed to fall back on the field as defined on the item_assets of the parent collection.

This would be particularly handy with #3. Some collections (like Sentinel-2 on AWS) don't provide eo:bands on individual items, just at the collection level, so we're missing some metadata right now.

This is easy to implement; the most annoying part is just handling the "get the parent collection" part across satstac vs pystac vs a plain dict.
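The merge itself is just a dict overlay; a sketch with a hypothetical helper (the Sentinel-2 example data here is made up for illustration):

```python
def resolve_asset(key: str, asset: dict, item_assets: dict) -> dict:
    # Fill fields missing on an individual asset from the collection's
    # item_assets entry for the same key; asset-level fields win.
    return {**item_assets.get(key, {}), **asset}

# e.g. a collection that only defines eo:bands at the collection level:
item_assets = {"B04": {"eo:bands": [{"common_name": "red"}], "type": "image/tiff"}}
asset = {"href": "s3://bucket/B04.tif", "type": "image/tiff; application=geotiff"}
merged = resolve_asset("B04", asset, item_assets)
print(merged["eo:bands"][0]["common_name"])  # red
print(merged["type"])  # image/tiff; application=geotiff
```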

Support relative asset links

I'm sure we don't handle it correctly right now. This will be necessary to work with any self-contained catalog or relative published catalog, which could be particularly useful if working with local data.

There's perhaps a larger question here about whether we want to use PySTAC internally instead of raw JSON, since it would handle this (and other things, like item assets #4) for us.
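As a sketch, resolving against the item's self href is plain URL joining (the helper is hypothetical; pystac exposes this kind of logic itself via Asset.get_absolute_href(), which is part of the argument for using it internally):

```python
from urllib.parse import urljoin

def resolve_href(asset_href: str, item_self_href: str) -> str:
    # Resolve a possibly-relative asset href against the item's self link.
    # Absolute hrefs (different scheme) pass through unchanged.
    return urljoin(item_self_href, asset_href)

print(resolve_href("./B04.tif", "https://example.com/catalog/item.json"))
# https://example.com/catalog/B04.tif
print(resolve_href("s3://bucket/B04.tif", "https://example.com/catalog/item.json"))
# s3://bucket/B04.tif
```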

xarray and rioxarray compatibility

xarray 0.18 and rioxarray 0.4 were just released (http://xarray.pydata.org/en/stable/whats-new.html#v0-18-0-6-may-2021), and they include some important changes to backend configuration. Specifically, you can now load datasets through the rasterio backend like so:

xds = xarray.open_dataset("my.tif", engine="rasterio")

I actually haven't looked closely at the stackstac code yet to know how the dataset opening logic currently compares in these libraries... but it would be great to add some docs on integrating with rioxarray. One nice use-case is saving a small netCDF timeseries subset for use locally in QGIS.

Also, I'm wondering if it's possible to relax (^0) or bump (^0.18.0) the current xarray pin for the next release? We could add some matrix tests against xarray master to ensure things don't break, since currently minor bumps in xarray do seem more like major bumps ;) (#26).

xarray = "^0.17.0"

occasional lockups during dask reads

It seems that stackstac will occasionally hang indefinitely while doing a dataset read:
image

call stack:

File "/srv/conda/envs/notebook/lib/python3.8/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
File "/srv/conda/envs/notebook/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
File "/srv/conda/envs/notebook/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/threadpoolexecutor.py", line 55, in _worker
    task.run()
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/_concurrent_futures_thread.py", line 66, in run
    result = self.fn(*self.args, **self.kwargs)
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 3616, in apply_function
    result = function(*args, **kwargs)
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 3509, in execute_task
    return func(*map(execute_task, args))
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 3509, in execute_task
    return func(*map(execute_task, args))
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 3509, in execute_task
    return func(*map(execute_task, args))
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 3509, in execute_task
    return func(*map(execute_task, args))
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/dask/optimization.py", line 963, in __call__
    return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/dask/core.py", line 151, in get
    result = _execute_task(task, cache)
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/stackstac/to_dask.py", line 172, in fetch_raster_window
    data = reader.read(current_window)
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/stackstac/rio_reader.py", line 425, in read
    result = reader.read(
File "/srv/conda/envs/notebook/lib/python3.8/site-packages/stackstac/rio_reader.py", line 249, in read
    return self.dataset.read(1, window=window, **kwargs)

Is it possible to pass in a timeout parameter or something like that, or would I be better off just cancelling the job entirely when something like this happens?
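Until there's a fix, one caller-side option is to bound the wait yourself - though note that an abandoned, hung read keeps its thread alive, so this limits how long you block, not the work itself. A stdlib sketch, nothing stackstac-specific:

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def call_with_timeout(fn, *args, timeout=60, **kwargs):
    # Run a blocking call in a worker thread; raise TimeoutError if it
    # doesn't finish in time. The worker thread is abandoned, not killed.
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(fn, *args, **kwargs).result(timeout=timeout)
    finally:
        pool.shutdown(wait=False)

print(call_with_timeout(lambda: "read ok", timeout=5))  # read ok
try:
    call_with_timeout(time.sleep, 1, timeout=0.1)
except TimeoutError:
    print("timed out")
```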

Puzzling behavior of accumulate_metadata (None and NaTs in coordinates)

I'm trying out stackstac with this valid STAC 1.0 static catalog here:
https://github.com/scottyhq/sentinel1-rtc-stac/tree/main/12SYJ

The following appears to work but is a bit slow (running in the same region as the data, AWS us-west-2), and the VisibleDeprecationWarning caught my eye.

import pystac
import stackstac
cat = pystac.read_file('https://raw.githubusercontent.com/scottyhq/sentinel1-rtc-stac/main/12SYJ/catalog.json')

%%time 
da = stackstac.stack(cat,
                     assets=['gamma0_vv'])
da
# CPU times: user 188 ms, sys: 12.5 ms, total: 201 ms
# Wall time: 3.5 s
# stackstac/accumulate_metadata.py:151: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences

Inspecting the time coordinate we get a NaT, which I've figured out happens when only one of the Items has fractional seconds in its datetime and the others do not.

da.time
"""
array(['2016-11-21T01:09:37.000000000', '2017-01-08T01:09:43.000000000',
       '2017-02-07T01:10:20.000000000', '2017-02-14T01:02:16.000000000',
       '2018-01-03T01:09:50.000000000',                           'NaT',
       '2018-01-15T01:09:50.000000000', '2019-01-10T01:09:57.000000000',
       '2019-01-22T01:09:56.000000000', '2019-02-03T01:09:56.000000000',
       '2020-08-31T13:18:20.000000000', '2020-09-07T13:10:16.000000000',
       '2020-09-12T13:18:20.000000000', '2021-01-05T13:10:15.000000000',
       '2021-01-10T13:18:18.000000000', '2021-01-17T13:10:14.000000000'],
"""

The 'NaT' disappears if you first normalize the timestamps:

for item in cat.get_all_items():
    item.datetime = item.datetime.replace(microsecond=0)
    print(item.datetime)

But there are other 'None' values showing up in coordinates that have sets of unique values. I don't see any obvious issues in the metadata (which passes validation), so I'm suspecting a bug in stackstac, based on the VisibleDeprecationWarning. See some of the other coordinates below with None values. The Nones show up at the front of the coords and are not restricted to specific items...

da['sat:orbit_state']
#array([None, None, None, None, None, None, None, None, None, 'ascending', 'descending', 'descending', 'descending', 'descending', 'descending', 'descending'], dtype=object)

da['platform']
#array([None, 'sentinel-1b', 'sentinel-1a', 'sentinel-1a', 'sentinel-1b','sentinel-1b', 'sentinel-1b', 'sentinel-1b', 'sentinel-1b','sentinel-1b', 'sentinel-1a', 'sentinel-1a', 'sentinel-1a','sentinel-1a', 'sentinel-1a', 'sentinel-1a'], dtype=object)

da['sat:relative_orbit']
#array([None, None, 49, 151, 49, 151, 49, 49, 49, 49, 129, 56, 129, 56, 129, 56], dtype=object)

Coordinate computation does not take into account inverted axis in `center` mode

To be compatible with rioxarray, one needs to use stackstac.stack(..., xy_coords="center") when computing X/Y coordinate values. When using this mode on data with an "inverted" Y axis (the most common scenario), the Y-axis coordinates are offset by one pixel size in the positive direction.

I have made a small reproducer. The data is a global synthetic image with 1 degree per pixel in EPSG:4326; when loading it with xy_coords="center", you would expect the Y coordinate to span from -89.5 to 89.5, but instead it goes from 90.5 to -88.5.

https://nbviewer.org/gist/Kirill888/b3dad8afdc10b37cd21af4aea8f417e3/stackstac-xy_coords-error-report.ipynb
https://gist.github.com/Kirill888/b3dad8afdc10b37cd21af4aea8f417e3

This causes issue reported earlier here: #68

The code that computes the coordinates just offsets the "top-left" coordinate by a positive half pixel size, but it should instead offset by sign(coord[1] - coord[0]) * abs(resolution) * 0.5.
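To illustrate with the reproducer's numbers, here is a sketch of the corrected computation (not stackstac's actual code): offsetting by half the *signed* step handles both axis directions.

```python
import numpy as np

def pixel_centers(topleft: float, step: float, n: int) -> np.ndarray:
    # step is the signed pixel size: negative for a Y axis running
    # north-to-south. Offsetting the top-left edge by half the signed step
    # (instead of always +0.5 * abs(resolution)) keeps each center inside
    # its own pixel.
    edges = topleft + step * np.arange(n)
    return edges + step / 2

# The reproducer's global 1-degree grid in EPSG:4326, Y from +90 down to -90:
y = pixel_centers(90.0, -1.0, 180)
print(y[0], y[-1])  # 89.5 -89.5
```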

fetch_raster_window should use `fill_value`

At `return np.broadcast_to(np.nan, (1, 1) + windows.shape(current_window))`, fetch_raster_window fills with NaN, which can result in `data.dtype` not matching `data.compute().dtype`. It should use the user-provided `fill_value` instead.
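A minimal illustration of why the dtype flips (NumPy only, no rasters needed):

```python
import numpy as np

# Broadcasting np.nan always yields float64, regardless of the dtype the
# user requested; broadcasting the fill_value cast to that dtype does not.
fill_value, dtype = 0, np.dtype("int16")
bad = np.broadcast_to(np.nan, (1, 1, 4, 4))
good = np.broadcast_to(np.asarray(fill_value, dtype=dtype), (1, 1, 4, 4))
print(bad.dtype, good.dtype)  # float64 int16
```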

A non-minimal reproducer, from microsoft/PlanetaryComputer#17

import planetary_computer
import pystac_client
import stackstac
from rasterio.enums import Resampling

catalog = pystac_client.Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")

aoi = {
    "type": "Polygon",
    "coordinates": [
        [
            [-100, 30],
            [-100, 66],
            [-47, 66],
            [-47, 30],
            [-100, 30]
        ]
    ],
}

search = catalog.search(
    #collections=["cop-dem-glo-90"], intersects=aoi
    collections=["alos-dem"], intersects=aoi
)
items = list(search.get_items())
signed_items = [planetary_computer.sign(item).to_dict() for item in items]

a = stackstac.stack(
        signed_items,
        assets=["data"],  
        chunksize=512,
        resolution=1000,
        epsg=32198,
        resampling=Resampling.average,
        dtype="int16",
        fill_value=0,
        # bounds=[-2009488, -715776, 1401061, 2597757],
      ).drop('band').squeeze()
# mosaic = stackstac.mosaic(data)
# mosaic
assert a.dtype == a[0, :10, :10].compute().dtype

I'll make a PR.

TypeError: Unrecognized STAC collection type <class 'pystac.item_collection.ItemCollection'>

I am trying to use stackstac (Windows, virtual environment) but am getting this error. I also tried it in Google Colab, but pip install stackstac installs the older version 0.1.1.
Error on windows:

TypeError                                 Traceback (most recent call last)
File <timed exec>:1, in <module>

File C:\ProgramData\Anaconda3\envs\a2\lib\site-packages\stackstac\stack.py:278, in stack(items, assets, epsg, resolution, bounds, bounds_latlon, snap_bounds, resampling, chunksize, dtype, fill_value, rescale, sortby_date, xy_coords, properties, band_coords, gdal_env, errors_as_nodata, reader)
     20 def stack(
     21     items: Union[ItemCollectionIsh, ItemIsh],
     22     assets: Optional[Union[List[str], AbstractSet[str]]] = frozenset(
   (...)
     43     reader: Type[Reader] = AutoParallelRioReader,
     44 ) -> xr.DataArray:
     45     """
     46     Create an `xarray.DataArray` of all the STAC items, reprojected to the same grid and stacked by time.
     47 
   (...)
    276         automatically computed from the items you pass in.
    277     """
--> 278     plain_items = items_to_plain(items)
    280     if sortby_date is not False:
    281         plain_items = sorted(
    282             plain_items,
    283             key=lambda item: item["properties"].get("datetime", "") or "",
    284             reverse=sortby_date == "desc",
    285         )

File C:\ProgramData\Anaconda3\envs\a2\lib\site-packages\stackstac\stac_types.py:163, in items_to_plain(items)
    160 if isinstance(items, PystacItemCollection):
    161     return [item.to_dict() for item in items]
--> 163 raise TypeError(f"Unrecognized STAC collection type {type(items)}: {items!r}")

TypeError: Unrecognized STAC collection type <class 'pystac.item_collection.ItemCollection'>: <pystac.item_collection.ItemCollection object at 0x0000028DF16140D0>`
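One workaround until the installed version recognizes pystac's ItemCollection is to hand stackstac plain dicts instead. A self-contained sketch (DummyItem just stands in for pystac.Item, which exposes the same to_dict()):

```python
class DummyItem:
    """Stand-in for pystac.Item; only to_dict() matters here."""
    def __init__(self, d):
        self._d = d

    def to_dict(self):
        return self._d

# stackstac.stack(item_collection) raises on some version combinations, but a
# list of plain item dicts is always accepted:
item_collection = [DummyItem({"id": "item-a"}), DummyItem({"id": "item-b"})]
plain_items = [item.to_dict() for item in item_collection]
print([i["id"] for i in plain_items])  # ['item-a', 'item-b']
```

Upgrading stackstac (and pystac) so the two agree on the ItemCollection type is the proper fix; the conversion is just a stopgap.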
