pystac's People

Contributors

bobinmathew, chelm, chuckwondo, dependabot[bot], duckontheweb, emmanuelmathot, fnattino, gadomski, hectcastro, ircwaves, john-dupuy, jpolchlo, jsignell, kylebarron, l0b0, lossyrob, m-mohr, martinfleis, matthewhanson, philvarner, pjhartzell, richardscottoz, schwehr, sgillies, simonkassel, thomas-maschler, tschaub, tyler-c2s, volaya, whatnick


pystac's Issues

Infinite recursion with `get_all_items`

When reading s3://rasterfoundry-development-data-us-east-1/berlin-catalog/catalog.json, initially, the reported links are correct:

>>> from pystac import Catalog
>>> catalog = Catalog.from_file("s3://rasterfoundry-development-data-us-east-1/berlin-catalog/catalog.json")
>>> catalog.links

[<Link rel=child target=./collection.json>,
 <Link rel=self target=s3://rasterfoundry-development-data-us-east-1/berlin-catalog/catalog.json>,
 <Link rel=root target=<Catalog id=berlin>>]

However, iterating over `get_all_items` raises `RecursionError: maximum recursion depth exceeded in comparison`, and inspecting the links after the fact reveals that the catalog has picked up a child link to itself:

>>> catalog.links
[<Link rel=child target=<Catalog id=berlin>>,
 <Link rel=self target=s3://rasterfoundry-development-data-us-east-1/berlin-catalog/catalog.json>,
 <Link rel=parent target=<Catalog id=berlin>>,
 <Link rel=root target=<Catalog id=berlin>>]

Discussion: is PyStac useful for creating offline STAC Catalogs?

Hi, I was looking for a support forum but couldn't find one. I hope this is an okay place to ask for support. If not, could someone direct me to the appropriate place?

I've got a set of GeoTIFFs in a Google bucket that I will be adding to over time. I'd like to use STAC to organize them in a standard way for listing and searching. What I'd like to do is create a local catalog and then upload it to a publicly accessible bucket. I followed the tutorial here: https://pystac.readthedocs.io/en/latest/tutorials/how-to-create-stac-catalogs.html and got what I could from the docs.

I made a tracer script to see if this would do what I wanted: https://gist.github.com/richpsharp/77e390901c6c266c00f5c2333a835e48, but when I examine my catalog.json there are a lot of Windows backslashes in the paths. Here's a snippet:

{
    "id": "mygsstac",
    "stac_version": "0.8.1",
    "description": "mygsstac",
    "links": [
        {
            "rel": "root",
            "href": ".\\catalog.json",
            "type": "application/json"
        },
        {
            "rel": "item",
            "href": ".\\raster_a\\raster_a.json",
            "type": "application/json"
        },
...

I was hoping to see paths that would be compatible with URLs, or perhaps to set a "root" from which all the other hrefs would be built. I'm worried that I don't understand how to use STAC or PySTAC and what I'm doing is a little skewed. Either that or I'm missing a big "set base url" switch. Thank you for any help!
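
For what it's worth, normalize_hrefs looks like that "set base url" switch — a minimal sketch, assuming a made-up bucket URL (since the base is a URL, the rewritten hrefs use forward slashes even on Windows):

from pystac import Catalog, CatalogType

catalog = Catalog.from_file("catalog.json")

# Rewrite every href in the tree against a single base URL.
# The bucket URL below is a placeholder, not a real location.
catalog.normalize_hrefs("https://storage.googleapis.com/my-bucket/stac")

# SELF_CONTAINED writes relative links, so the tree can be uploaded anywhere.
catalog.save(catalog_type=CatalogType.SELF_CONTAINED)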

Update PySTAC to use STAC 0.9.0

  • Update schemas to 0.9.0 versions and fix so that tests pass.

Changes to account for (from CHANGELOG)

Crossed out items did not require changes in PySTAC.

Added

  • data role, as a suggestion for a common role for data files to be used in case data providers don't come up with their own names and semantics

  • ItemCollection requires stac_version field, stac_extensions has also been added

  • A description field has been added to Item assets (also Asset definitions extension)

  • Field mission to Common Metadata fields

  • Extensions:

  • STAC API:

    • Added the Item and Collection API Version extension to support versioning in the API specification
    • Run npm run serve or npm run serve-ext to quickly render development versions of the OpenAPI spec in the browser
  • Basics added to Common Metadata definitions with new description field for
    Item properties

  • New fields to the link object to facilitate pagination support for POST requests (STAC API)

  • Clarification text on HTTP verbs in STAC API (STAC API)

Changed

  • Collection field `properties` and the merge ability moved to a new extension 'Commons'

  • Collection summaries merge array fields now

  • Moved angle definitions from the `eo` extension to the new `view` extension

    • eo:off_nadir -> view:off_nadir
    • eo:azimuth -> view:azimuth
    • eo:incidence_angle -> view:incidence_angle
    • eo:sun_azimuth -> view:sun_azimuth
    • eo:sun_elevation -> view:sun_elevation
  • Support for CommonMark 0.29 instead of CommonMark 0.28

  • Added attribute roles to Item assets (also Asset definitions extension), to be used similarly to Link rel

  • Updated API yaml to clarify bbox filter should be implemented without brackets. Example: bbox=160.6,-55.95,-170,-25.89

  • Several fields have been moved from extensions or item fields to the Common Metadata fields:

    • eo:platform / sar:platform => platform
    • eo:instrument / sar:instrument => instruments, also changed from string to array of strings
    • eo:constellation / sar:constellation => constellation
    • dtr:start_datetime => start_datetime
    • dtr:end_datetime => end_datetime
  • Extensions:

    • Data Cube extension: Changed allowed formats (removed PROJ string, added PROJJSON / WKT2) for reference systems
    • Checksum extension is now using self-identifiable hashes (Multihash)
    • Changed sar:type to sar:product_type and sar:polarization to sar:polarizations in the SAR extension
  • STAC API:

    • The endpoint /stac has been merged with /
    • The endpoint /stac/search is now called /search
    • Sort Extension - added non-JSON query/form parameter format
    • Fields extension has a simplified format for GET parameters
    • search extension renamed to context extension. JSON object renamed from search:metadata to context
    • Removed "next" from the search metadata and query parameter, added POST body and headers to the links for paging support
    • Query Extension - type restrictions on query predicates are more accurate, which may require additional implementation support
  • Item title definition moved from core Item fields to Common Metadata Basics
    fields. No change is required for STAC Items.

  • putFeature can return a PreconditionFailed to provide more explicit information when the resource has changed in the server

  • Sort extension now uses "+" and "-" prefixes for GET requests to denote sort order.

  • Clarified how /search links must be added to / and changed that links to both GET and POST must be provided now that the method can be specified in links

Removed

  • version field in STAC Collections. Use Version Extension instead
  • summaries field from Catalogs. Use Collections instead
  • Asset Types (pre-defined values for the keys of individual assets, not media types) in Items. Use the asset's roles instead
  • license field doesn't allow SPDX expressions any longer. Use various and links instead
  • Extensions:
    • eo:platform, eo:instrument, eo:constellation from EO extension, and sar:platform, sar:instrument, sar:constellation from the SAR extension
    • Removed from EO extension field eo:epsg in favor of proj:epsg
    • gsd and accuracy from eo:bands in the EO extension
    • sar:absolute_orbit and sar:center_wavelength fields from the SAR extension
    • data_type and unit from the sar:bands object in the SAR extension
    • Datetime Range (dtr) extension. Use the Common Metadata fields instead
  • STAC API:
    • next from the search metadata and query parameter
  • In API, removed any mention of using media type multipart/form-data and x-www-form-urlencoded

Fixed

  • The license field in Item and Collection spec explicitly mentions that the value proprietary without a link means that the data is private
  • Clarified how to fill stac_extensions
  • More clarifications; typos fixed
  • Fixed Item JSON Schema now allOf optional Common Metadata properties are evaluated
  • Clarified usage of optional Common Metadata fields for STAC Items
  • Clarified usage of paging options, especially in relation to what OGC API - Features offers
  • Allow Commonmark in asset description, as it's allowed everywhere else
  • Put asset description in the API
  • Fixed API spec regarding license expressions
  • Added missing schema in the API Version extension
  • Fixed links in the Landsat example in the collection-spec
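
For the field renames listed above, a deserialization shim along these lines could work — a sketch only, not PySTAC's actual implementation, and it covers just the renames in this changelog:

# Sketch of a 0.8.x -> 0.9.0 property rename shim; illustrative only.
RENAMES = {
    'eo:off_nadir': 'view:off_nadir',
    'eo:azimuth': 'view:azimuth',
    'eo:incidence_angle': 'view:incidence_angle',
    'eo:sun_azimuth': 'view:sun_azimuth',
    'eo:sun_elevation': 'view:sun_elevation',
    'eo:platform': 'platform',
    'sar:platform': 'platform',
    'eo:constellation': 'constellation',
    'sar:constellation': 'constellation',
    'dtr:start_datetime': 'start_datetime',
    'dtr:end_datetime': 'end_datetime',
}

def upgrade_item_properties(properties):
    """Return a copy of an Item's properties using 0.9.0 field names."""
    upgraded = {}
    for key, value in properties.items():
        if key in ('eo:instrument', 'sar:instrument'):
            # instruments also changed from a string to a list of strings.
            upgraded['instruments'] = [value]
        else:
            upgraded[RENAMES.get(key, key)] = value
    return upgraded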

Improve Collection <-> Item property inheritance

Currently, reading from a Collection with common Item properties will work: e.g. if an Item lists `eo` in its `stac_extensions` but all `eo:*` properties are on the Collection, PySTAC will read it fine and produce EOItems.

However, on the write side, there's not a good mechanism to merge common Item properties into the collection.

This issue is for figuring out the best approach and API for collapsing Item properties for writing.
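
One possible shape for that collapse — a sketch with a hypothetical helper name, not a settled API:

def collapse_common_properties(collection, items):
    """Hypothetical sketch: hoist properties that every Item shares
    (with equal values) into the Collection's Commons-style properties,
    and drop them from the Items."""
    if not items:
        return
    common = dict(items[0].properties)
    for item in items[1:]:
        common = {k: v for k, v in common.items()
                  if item.properties.get(k) == v}
    common.pop('datetime', None)  # per-Item fields must stay on the Item
    collection.properties = {**(getattr(collection, 'properties', None) or {}),
                             **common}
    for item in items:
        for key in common:
            del item.properties[key]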

Wrong absolute path while listing catalog

Hi,

I have noticed a problem in absolute HREF construction. When I try to read a catalog from file using the relative path, the error occurs:

[Errno 22] Invalid argument: 'c://C:\Users\*****'

I checked the source files and it seems that the problem is in the make_absolute_href() function in utils.py.

abs_path = os.path.abspath(os.path.join(start_dir, parsed_source.path))

abs_path already includes C:\ , so there is no need to build:
'{}://{}{}'.format(parsed_start.scheme, parsed_start.netloc, abs_path)

If I comment out the following:

if parsed_start.scheme != '':
    return '{}://{}{}'.format(parsed_start.scheme,
                              parsed_start.netloc, abs_path)

everything works fine for me!

Has anybody faced a similar problem?
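
For reference, a sketch of the guard the commented-out branch implies — assuming the only schemes worth special-casing are single-letter ones, which on Windows are drive letters rather than real URL schemes:

import os
from urllib.parse import urlparse

def make_absolute_href(source_href, start_href):
    """Simplified sketch of the fix; the real function handles more cases."""
    parsed_source = urlparse(source_href)
    parsed_start = urlparse(start_href)
    start_dir = os.path.dirname(parsed_start.path)
    abs_path = os.path.abspath(os.path.join(start_dir, parsed_source.path))
    # A single-letter "scheme" (e.g. 'c') is a Windows drive letter, not a
    # URL scheme, and abs_path already carries the full drive-qualified path.
    if parsed_start.scheme != '' and len(parsed_start.scheme) > 1:
        return '{}://{}{}'.format(parsed_start.scheme,
                                  parsed_start.netloc, abs_path)
    return abs_path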

Cloning catalog duplicates root link

When cloning a catalog, you end up with two root links:

In [5]: catalog = Catalog.from_file("/opt/data/scene-catalog/catalog.json")

In [6]: cloned = catalog.clone()

In [7]: cloned.links
Out[7]: 
[<Link rel=root target=<Catalog id=ABSOLUTE_PUBLISHED>>,
 <Link rel=self target=/opt/data/scene-catalog/catalog.json>,
 <Link rel=child target=/opt/data/scene-catalog/c8d68bc1-a862-4d7f-96d3-5af2436695a4/collection.json>,
 <Link rel=root target=<Catalog id=ABSOLUTE_PUBLISHED>>]

This is bad because it breaks assumptions about what `get_root_link` should return -- or at least it appears to, if I'm reading the `next()` call correctly as expecting only one result. I think this is also responsible for some surprising behaviors with normalization when building up a catalog, but I can't prove that part.
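
Until the clone logic is fixed, one workaround is to dedupe root links after cloning — a sketch that keeps only the first:

# Workaround sketch: keep the first root link, drop any duplicates.
seen_root = False
deduped = []
for link in cloned.links:
    if link.rel == 'root':
        if seen_root:
            continue
        seen_root = True
    deduped.append(link)
cloned.links = deduped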

Datetime format doesn't work well with libraries expecting ISO8601 formatted strings

While the STAC spec specifies that datetimes should be formatted according to RFC 3339, section 5.6, the space separator between the date and time causes some issues with libraries expecting an ISO8601 formatted string. Changing the separator to 'T' would make the datetime compliant with both RFC 3339, section 5.6 as well as ISO8601.

For example, currently a datetime is formatted as `2019-01-01 00:00:00Z`, and the updated datetime string would be formatted as `2019-01-01T00:00:00Z`.
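
For illustration, Python's own datetime shows both forms: str() uses the space separator, while isoformat() already produces the 'T'-separated form:

from datetime import datetime, timezone

dt = datetime(2019, 1, 1, tzinfo=timezone.utc)
print(str(dt))         # 2019-01-01 00:00:00+00:00  (space separator)
print(dt.isoformat())  # 2019-01-01T00:00:00+00:00  ('T' separator)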

CI badge shows status of any CI build instead of just current develop CI status

The CI workflow doesn't differentiate between builds of a PR or the develop branch. This causes the CI badge in the README to show failures when the job has failed on PR builds, which gives the appearance that the develop branch has broken the build.

The documentation for workflows says that a branch parameter to the badge might work, e.g.

https://github.com/azavea/pystac/workflows/CI/badge.svg?branch=develop

though currently that results in a no-status badge.

Store items in non-canonical path

I want to create a self-contained catalog that uses relative links, but override the default layout in which each item is written to its own subdirectory of its parent catalog. E.g., I want

/tmp/test-catalog/test_dataset_1/BD44_500_031096.json

rather than

/tmp/test-catalog/test_dataset_1/BD44_500_031096/BD44_500_031096.json

Is this possible? I tried setting the href paths manually with the set_self_href method on all STAC objects and not calling normalize_hrefs, but that didn't work. See example here: https://gist.github.com/palmerj/a78dc0b99da0720266ac19c736785802

Develop method for identifying STAC object type and version number

Develop a method that takes in a dict, and returns a tuple (object_type, version, [extensions]).

This will be used to identify the STAC objects that are contained in dictionaries. This will be useful for two things:

  • Validation: stac-validator and other validation tools - including one we are building to validate all examples in the stac-spec as part of radiantearth/stac-spec#623 - can use this method to determine what schema(s) to apply to the validation process.

  • Deserialization in PySTAC for reading older versions of STAC - a proposed solution to #36 could include a JSON transformation from older versions of STAC objects to the version that to_dict on PySTAC objects requires. In order to write this in a sane way, it would be good to know a priori what STAC object type, version, and extensions we are working with.

This method is probably not going to be very pretty - it's hard to imagine a version that isn't a set of conditionals that looks like headphones pulled out of a messy drawer. The idea is, if we put this method in PySTAC, we can centralize that messy logic in one place so that no duplicates have to exist. This follows what I call the One Monster Rule: if you're going to create a monster, make sure it's only created once.

Another approach we tried was to use jsonschema with a large combined oneOf reference that included all schemas from the STAC repo - jsonschema will try to validate against each of the schemas to determine which one works. However, if multiple schemas pass validation, this throws an error, which makes a lot of sense.

Any suggestions on how to better solve this are welcome!
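
For concreteness, here is a rough sketch of the shape such a method could take — the heuristics below are illustrative, not a proposed final logic:

def identify_stac_object(d):
    """Sketch: guess (object_type, version, extensions) from a dict.
    The checks are illustrative heuristics only."""
    version = d.get('stac_version')
    extensions = d.get('stac_extensions', [])
    if d.get('type') == 'Feature':
        object_type = 'ITEM'
    elif 'extent' in d:
        object_type = 'COLLECTION'
    elif 'description' in d and 'links' in d:
        object_type = 'CATALOG'
    else:
        object_type = 'UNKNOWN'
    return (object_type, version, extensions)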

/cc Team Validation @ the Arlington STAC Sprint @jbants @anayeaye

Catalog from_file not working

Source code:

from pystac import Catalog
cat = Catalog.from_file('https://sentinel-stac.s3.amazonaws.com/catalog.json')

Error:

Traceback (most recent call last):
  File "source/main.py", line 5, in <module>
    cat = Catalog.from_file('https://sentinel-stac.s3.amazonaws.com/catalog.json')
  File "/home/nawaz/Documents/disaster_managment/venv/lib/python3.6/site-packages/pystac/stac_object.py", line 408, in from_file
    d = STAC_IO.read_json(href)
  File "/home/nawaz/Documents/disaster_managment/venv/lib/python3.6/site-packages/pystac/stac_io.py", line 99, in read_json
    return json.loads(STAC_IO.read_text(uri))
  File "/home/nawaz/Documents/disaster_managment/venv/lib/python3.6/site-packages/pystac/stac_io.py", line 64, in read_text
    return cls.read_text_method(uri)
  File "/home/nawaz/Documents/disaster_managment/venv/lib/python3.6/site-packages/pystac/stac_io.py", line 14, in default_read_text_method
    with open(uri) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'https://sentinel-stac.s3.amazonaws.com/catalog.json'
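
The traceback shows that default_read_text_method just calls open(), which only handles local paths in this version of PySTAC; overriding STAC_IO.read_text_method (the same pattern the tutorials use for S3) works for HTTP. A minimal sketch using urllib:

from urllib.parse import urlparse
from urllib.request import urlopen

from pystac import STAC_IO, Catalog

def http_read_method(uri):
    # Fetch http(s) URIs over the network; defer everything else
    # to the default local-file reader.
    if urlparse(uri).scheme in ('http', 'https'):
        with urlopen(uri) as response:
            return response.read().decode('utf-8')
    return STAC_IO.default_read_text_method(uri)

STAC_IO.read_text_method = http_read_method
cat = Catalog.from_file('https://sentinel-stac.s3.amazonaws.com/catalog.json')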

LabelItems require too much state tracking from users

Problem Description

As currently implemented, it's challenging to find a place to put a LabelItem's asset. The workflow this prevents is creating a bunch of LabelItems in memory, creating a Collection in memory, then adding that Collection and those LabelItems later to a Catalog, which is only at the very end written to disk. Having nowhere to store the data while building the object up in memory makes fulfilling the following requirement from the STAC spec difficult:

The Label Extension requires at least one asset that uses the key "labels". The asset will contain a link to the actual label data.

I don't know where it's going to go right now! 🤔 And I don't think it's great ergonomically to require a bunch of filesystem manipulation/writing to disk to build up a STAC. Any failures midway through the process then leave halfway-complete catalogs lying around, which adds to my manual cleanup burden.

Proposed solution

It would be nice to shift IO to the end of the world to the extent possible. One way to do this would be to initialize LabelItems with a FeatureCollection that gets written out adjacent to the label item (and assets get updated) whenever the LabelItem is saved.
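
A rough sketch of that idea — the class below is hypothetical; nothing like it exists in PySTAC today:

import json
import os

from pystac import Asset

class InMemoryLabelSource:
    """Hypothetical sketch: hold label GeoJSON in memory and only write
    it next to the LabelItem when the catalog is finally saved."""

    def __init__(self, label_item, feature_collection):
        self.label_item = label_item
        self.feature_collection = feature_collection

    def flush(self):
        # Derive the asset location from wherever the item ended up.
        item_href = self.label_item.get_self_href()
        label_href = os.path.join(os.path.dirname(item_href), 'labels.geojson')
        with open(label_href, 'w') as f:
            json.dump(self.feature_collection, f)
        self.label_item.add_asset(
            'labels', Asset(href=label_href, media_type='application/geo+json'))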

Add mechanism to allow for some backwards compatibility

As STAC is still in flux, property names, etc might shift. Until things stabilize, it would be nice to have a consistent way to read STAC that is a little bit off, i.e. a version or so back.

One option is to have to_dict methods simply push the dict through a transformation that accounts for any old versions. We're already doing some of that in the codebase, e.g. https://github.com/azavea/pystac/blob/v0.3.0/pystac/collection.py#L206 . It would be good to formalize this and have a place to collect these types of updating transformations.
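
One possible shape — a sketch of a per-version transformation registry (the names and the example transform are hypothetical):

# Hypothetical sketch: collect per-version upgrade functions in one place
# and push every incoming dict through them before deserialization.
def upgrade_from_0_8(d):
    # Illustrative only: 0.9.0 removed the Collection 'version' field
    # in favor of the Version extension (see the 0.9.0 changelog).
    d.pop('version', None)
    return d

MIGRATIONS = {
    '0.8.1': upgrade_from_0_8,
}

def upgrade(d):
    fn = MIGRATIONS.get(d.get('stac_version'))
    if fn is not None:
        d = fn(d)
        d['stac_version'] = '0.9.0'  # illustrative target version
    return d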

Proprietary collections can be written without license link

Problem Description

See catalog in s3://demo-mlhub-earth/panama-water-features-jan-apr-2019/ -- the collections have a proprietary license, but there's no license link in links.

Expected Behavior or Output

pystac should get mad at me when I try to write an invalid catalog
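
A check along these lines at write time would do it — a sketch:

def check_license_link(collection):
    """Sketch: refuse to write a proprietary Collection with no license link."""
    if collection.license == 'proprietary':
        if not any(link.rel == 'license' for link in collection.links):
            raise ValueError(
                'Collection {} has a proprietary license but no '
                '"license" link'.format(collection.id))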

Improve documentation on STAC_IO usage

Pydocs for STAC_IO should make it clearer how to override read_text_method etc. in order to work with user-defined IO methods. Also, add sphinx docs on the subject.

Improve API around licenses and providers

As of 0.8.0, Items can have licenses and providers in their properties. Currently we only have Collections associated with Provider objects. We should develop API to handle these optional fields.

Also, there should be some API to handle license links on Collections and Items, so that if these objects do point to a license, it can be handled via the API.

Allow setting relative paths with links and assets.

Currently, we can read STACs that have relative links, but can only write STACs with absolute links.

Implement a method on Catalog to set URIs to relative. Default behavior should follow the best-practices guidelines (i.e. self-contained catalogs with relative paths should not contain 'self' links).
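
The core of the write side is computing each href relative to its parent's location — a simplified sketch of such a helper (it skips the scheme and netloc handling a real version would need):

import os

def make_relative_href(source_href, start_href):
    """Simplified sketch: express source_href relative to the directory
    containing start_href; real code must also handle URL schemes."""
    start_dir = os.path.dirname(start_href)
    relpath = os.path.relpath(source_href, start=start_dir)
    if not relpath.startswith('.'):
        relpath = './' + relpath
    return relpath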

Saving part of a catalog

I've created an application that loads and saves a static STAC catalog to S3 using something like:

import logging
from urllib.parse import urlparse

from botocore.exceptions import ClientError
import pystac as stac

logger = logging.getLogger(__name__)

# _get_s3_resource() and StacCatalogError are application-level helpers.

def my_read_method(uri):
    parsed = urlparse(uri)
    if parsed.scheme == "s3":
        bucket = parsed.netloc
        key = parsed.path[1:]
        s3 = _get_s3_resource()
        try:
            logger.debug("Reading %s", key)
            obj = s3.Object(bucket, key)
            txt = obj.get()["Body"].read().decode("utf-8")
        except ClientError as e:
            raise StacCatalogError(str(e) + " " + key)
        return txt
    else:
        return stac.STAC_IO.default_read_text_method(uri)

def my_write_method(uri, txt):
    parsed = urlparse(uri)
    if parsed.scheme == "s3":
        bucket = parsed.netloc
        key = parsed.path[1:]
        s3 = _get_s3_resource()
        try:
            logger.debug("Writing %s", key)
            s3.Object(bucket, key).put(Body=txt)
        except ClientError as e:
            raise StacCatalogError(str(e) + " " + key)
    else:
        stac.STAC_IO.default_write_text_method(uri, txt)

stac.STAC_IO.read_text_method = my_read_method
stac.STAC_IO.write_text_method = my_write_method

I've noticed that once I start adding multiple collections to the catalog, PySTAC will load all catalog collections and their child items even if I'm not iterating through everything. This is slow when saving objects to AWS S3. To make matters worse, when I save a new collection (calling sub_catalogue.save on the parent non-root catalog), it will re-save all collection siblings and their child items. Some of my collections will end up having 1000s of items. I've read the docs and can't see how to use the API to save part of the catalog or avoid saving catalog items that have not changed. Is this possible?

Thanks heaps in advance.

Discussion: what should happen when self-contained STACs are copied?

I have a STAC that includes a bunch of chips of tifs. When I clone that STAC, I keep all of the items, and the assets point to the old tifs. I don't think this is necessarily wrong. I'm curious whether it's a deliberate choice to leave the references to the old tifs and not copy the tifs into the new stac or whether that's something that happened incidentally. I can see arguments for both ways --

In favor of not copying the data:

  • since the tifs are part of a different catalog, there's no relative path from the new catalog to the old data, so path construction requires some assumptions on PySTAC's part
  • presumably if I'm building stacs from other stacs i have access to the data in both places, so why copy?

In favor of being able within PySTAC to copy the data (obviously I can do whatever I want outside of PySTAC):

  • self-contained catalogs are nice, and there's currently no way to tell PySTAC to make a new self-contained catalog from an existing one as far as I can tell (it won't infer the copy behavior)
  • in multi-step pipelines for STAC production, I might want to delete everything but the output of the last step (i.e. only keep the "complete" catalog, where "complete" means "has had everything I want to do to it done), which means at the end my references to assets from previous stages will be invalid

Error when saving and then loading Catalog

Using PySTAC 0.3.3, the following code executed as a script works as expected (it saves a STAC and then opens it and prints the items). But if you uncomment the code in Block A, Line B will fail with the following stack trace. This is especially strange because all that Block A does is read from the catalog before saving it. This came up in the context of trying to modify some links to GeoTIFF files in an existing STAC and then saving the modified copy. I found that the following was the minimal example that replicates the error.

from urllib.parse import urlparse
from os.path import join

import boto3
from pystac import STAC_IO
from pystac import Catalog, LabelItem, CatalogType

# Copied from PyStac tutorial.
def setup_stac_s3():
    def my_read_method(uri):
        parsed = urlparse(uri)
        if parsed.scheme == 's3':
            bucket = parsed.netloc
            key = parsed.path[1:]
            s3 = boto3.resource('s3')
            obj = s3.Object(bucket, key)
            return obj.get()['Body'].read().decode('utf-8')
        else:
            return STAC_IO.default_read_text_method(uri)

    def my_write_method(uri, txt):
        parsed = urlparse(uri)
        if parsed.scheme == 's3':
            bucket = parsed.netloc
            key = parsed.path[1:]
            s3 = boto3.resource("s3")
            s3.Object(bucket, key).put(Body=txt)
        else:
            STAC_IO.default_write_text_method(uri, txt)

    STAC_IO.read_text_method = my_read_method
    STAC_IO.write_text_method = my_write_method

# Open a catalog and then save a copy locally.
setup_stac_s3()
stac_uri = ('s3://rasterfoundry-production-data-us-east-1/stac-exports/'
            'fd478c2b-3f71-41e4-a87b-e97a8a0d0afa/catalog.json')
cat = Catalog.from_file(stac_uri)
'''
# Block A. If this is uncommented, then Line B will fail.
for item in cat.get_all_items():
    print(item)
'''
new_stac_uri = '/opt/data/foo/test-stac-catalog'
cat.normalize_hrefs(new_stac_uri)
cat.save(catalog_type=CatalogType.SELF_CONTAINED)

# Open the local copy and iterate over it.
new_stac_uri = '/opt/data/foo/test-stac-catalog'
cat = Catalog.from_file(join(new_stac_uri, 'catalog.json'))
# Line B
for item in cat.get_all_items():
    print(item)

Stack trace:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/src/foo/pystac_test.py", line 51, in <module>
    for item in cat.get_all_items():
  File "/opt/conda/lib/python3.6/site-packages/pystac/catalog.py", line 281, in get_all_items
    yield from child.get_all_items()
  File "/opt/conda/lib/python3.6/site-packages/pystac/catalog.py", line 281, in get_all_items
    yield from child.get_all_items()
  File "/opt/conda/lib/python3.6/site-packages/pystac/catalog.py", line 279, in get_all_items
    yield from self.get_items()
  File "/opt/conda/lib/python3.6/site-packages/pystac/stac_object.py", line 260, in get_stac_objects
    link.resolve_stac_object(root=self.get_root())
  File "/opt/conda/lib/python3.6/site-packages/pystac/link.py", line 160, in resolve_stac_object
    obj = STAC_IO.read_stac_object(target_href, root=root)
  File "/opt/conda/lib/python3.6/site-packages/pystac/stac_io.py", line 119, in read_stac_object
    return cls.stac_object_from_dict(d, href=uri, root=root)
  File "/opt/conda/lib/python3.6/site-packages/pystac/serialization/__init__.py", line 31, in stac_object_from_dict
    merge_common_properties(d, json_href=href, collection_cache=collection_cache)
  File "/opt/conda/lib/python3.6/site-packages/pystac/serialization/common_properties.py", line 47, in merge_common_properties
    collection = STAC_IO.read_json(collection_href)
  File "/opt/conda/lib/python3.6/site-packages/pystac/stac_io.py", line 96, in read_json
    return json.loads(STAC_IO.read_text(uri))
  File "/opt/conda/lib/python3.6/site-packages/pystac/stac_io.py", line 61, in read_text
    return cls.read_text_method(uri)
  File "/opt/src/foo/pystac_test.py", line 19, in my_read_method
    return STAC_IO.default_read_text_method(uri)
  File "/opt/conda/lib/python3.6/site-packages/pystac/stac_io.py", line 12, in default_read_text_method
    with open(uri) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/opt/data/foo/test-stac-catalog/12cb17a8-ae08-469c-a2be-4d0619240014/400c22e3-5b54-438b-b600-5e9bd6d0a498/3c67b59c-2e6f-47fb-ba3c-0dd106941096/collection.json'

PySTAC reading collection redundantly

Currently PySTAC reads the collection of an item over and over, even if it's already read the collection - there is no cache support that gets triggered. This can slow reading STACs from cloud providers down considerably.

Perhaps implement a caching strategy that caches on HREF instead of just ID.
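
A minimal sketch of HREF-keyed caching layered over the read path (note this memoizes every document read, not just collections, so it only suits read-only sessions; a real fix would live in PySTAC's own resolution cache):

from pystac import STAC_IO

_cache = {}
_inner_read = STAC_IO.read_text_method

def caching_read_method(uri):
    # Cache on the HREF itself, so repeated reads of the same
    # collection.json hit memory instead of the network.
    if uri not in _cache:
        _cache[uri] = _inner_read(uri)
    return _cache[uri]

STAC_IO.read_text_method = caching_read_method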

Efficiently fetching a specific child object

I'm having trouble reducing the number of S3 calls in my application with a catalog containing many 1000's of child objects. If I call catalog.get_child(id, recursive=False), it will iterate through the child objects in sequence, resolving each of them until it finds the object it needs. Given that the child links don't contain the ID of the referenced object, this is understandable. However, the child URLs from a STAC catalog published using best practices do contain the ID. Would it be possible to somehow add an optimisation to pystac to use the URL to short-circuit the lookup process? I guess it could fall back to traversing the child objects if extracting the ID from the URL doesn't work or the returned object's ID doesn't match. I'm also wondering if this is an issue with the STAC specification.
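
A sketch of that short circuit — guess the child's href from the best-practices layout, try it directly, and fall back to the existing linear scan (get_child_fast is a hypothetical helper, not a PySTAC method):

import os

from pystac import STAC_IO

def get_child_fast(catalog, child_id):
    """Hypothetical sketch: try the best-practices href
    (<catalog dir>/<id>/collection.json) before scanning every child."""
    root_dir = os.path.dirname(catalog.get_self_href())
    guess = os.path.join(root_dir, child_id, 'collection.json')
    try:
        child = STAC_IO.read_stac_object(guess)
        if child.id == child_id:
            return child
    except Exception:
        pass
    # Fall back to the existing linear resolution.
    return catalog.get_child(child_id, recursive=False)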
