datacube-alchemist's Introduction

Datacube Alchemist - ODC Dataset to Dataset Converter

Purpose

Datacube Alchemist is a command line application for performing Dataset to Dataset transformations in the context of an Open Data Cube system.

It uses a configuration file which specifies an input Product or Products, a Transformation to perform, and output parameters and destination.

Features

  • Writes output to Cloud Optimised GeoTIFFs
  • Easily runs within a Docker Container
  • Parallelism using AWS SQS queues and Kubernetes
  • Write output data to S3 or a file system
  • Generates eo3 format dataset metadata, along with processing information
  • Generates STAC 1.0.0-beta.2 dataset metadata
  • Configurable thumbnail generation
  • Pass any command line options as Environment Variables

Installation

You can build the Docker image locally with Docker or Docker Compose. The commands are docker build --tag opendatacube/datacube-alchemist . or docker-compose build.

There's a Python setup file, so you can run pip3 install . in the root folder. Note that you will need to ensure that the Open Data Cube and all of its dependencies install successfully.

Usage

Development environment

To run some example processes you can use the Docker Compose file to create a local workspace. To start the workspace and run an example, you can do the following:

  • Export the environment variables ODC_ACCESS_KEY and ODC_SECRET_KEY with valid AWS credentials
  • Run make up or docker-compose up to start the postgres and datacube-alchemist Docker containers
  • make initdb to initialise the ODC database (or see the Makefile for the specific command)
  • make metadata will add the metadata that the Landsat example product needs
  • make product will add the Landsat product definitions
  • make index will index a range of Landsat scenes to test processing with
  • make wofs-one or make fc-one will process a single Fractional Cover or Water Observations from Space scene and output the results to the ./examples folder in this project directory

Production Usage

Datacube Alchemist is used in production by the Digital Earth Australia and Digital Earth Africa programs.

Queues

To run jobs from an SQS queue, it is good practice to create a dead-letter queue as well as a main queue. Jobs (messages) are picked up from the main queue and deleted if they are processed successfully. If processing fails, the message is not deleted; it becomes visible on the main queue again after a defined timeout, and if this happens more than a defined number of times the message is moved to the dead-letter queue. In this way, you can track work completion.
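As a sketch (assuming boto3; the queue names, visibility timeout and maxReceiveCount are placeholders), a main queue with an attached dead-letter queue can be created like this:

import json

import boto3

sqs = boto3.client("sqs")

# Create the dead-letter queue first and look up its ARN.
dlq_url = sqs.create_queue(QueueName="example-queue-deadletter")["QueueUrl"]
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq_url, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

# Main queue: after three failed receives a message is moved to the dead-letter queue.
sqs.create_queue(
    QueueName="example-queue",
    Attributes={
        "VisibilityTimeout": "600",
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": "3"}
        ),
    },
)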

Commands

Note that the --config-file can be a local path or a URI.

datacube-alchemist run-one

Usage: datacube-alchemist run-one [OPTIONS]

  Run with the config file for one input_dataset (by UUID)

Options:
  -c, --config-file TEXT  The path (URI or file) to a config file to use for the
                          job  [required]
  -u, --uuid TEXT         UUID of the scene to be processed  [required]
  --dryrun, --no-dryrun   Don't actually do real work
  --help                  Show this message and exit.

Note that --dryrun is optional; it runs a load at 1/10 scale and does not write output to the final destination.

datacube-alchemist run-one \
  --config-file ./examples/c3_config_wo.yaml \
  --uuid 7b9553d4-3367-43fe-8e6f-b45999c5ada6 \
  --dryrun

datacube-alchemist run-many

Note that the final arguments are datacube search expressions; see the Datacube Search documentation.

Usage: datacube-alchemist run-many [OPTIONS] [EXPRESSIONS]...

  Run Alchemist with the config file on all the Datasets matching an ODC query
  expression

  EXPRESSIONS

  Select datasets using [EXPRESSIONS] to filter by date, product type, spatial
  extents or other searchable fields.

      FIELD = VALUE
      FIELD in DATE-RANGE
      FIELD in [START, END]
      TIME < DATE
      TIME > DATE

  START and END can be either numbers or dates
  Dates follow YYYY, YYYY-MM, or YYYY-MM-DD format

  FIELD: x, y, lat, lon, time, product, ...

  eg. 'time in [1996-01-01, 1996-12-31]'
      'time in 1996'
      'time > 2020-01'
      'lon in [130, 140]' 'lat in [-40, -30]'
      product=ls5_nbar_albers

Options:
  -c, --config-file TEXT  The path (URI or file) to a config file to use for the
                          job  [required]
  -l, --limit INTEGER     For testing, limit the number of tasks to create or
                          process.
  --dryrun, --no-dryrun   Don't actually do real work
  --help                  Show this message and exit.

Example

datacube-alchemist run-many \
  --config-file ./examples/c3_config_wo.yaml \
  --limit=2 \
  --dryrun \
  time in 2020-01

datacube-alchemist run-from-queue

See the notes on queues above for how the main queue and dead-letter queue are used to track work completion.

Usage: datacube-alchemist run-from-queue [OPTIONS]

  Process messages from the given queue

Options:
  -c, --config-file TEXT       The path (URI or file) to a config file to use
                               for the job  [required]
  -q, --queue TEXT             Name of an AWS SQS Message Queue  [required]
  -l, --limit INTEGER          For testing, limit the number of tasks to create
                               or process.
  -s, --queue-timeout INTEGER  The SQS message Visibility Timeout in seconds,
                               default is 600, or 10 minutes.
  --dryrun, --no-dryrun        Don't actually do real work
  --sns-arn TEXT               Publish resulting STAC document to an SNS
  --help                       Show this message and exit.

Example

datacube-alchemist run-from-queue \
  --config-file ./examples/c3_config_wo.yaml \
  --queue example-queue-name \
  --limit=1 \
  --queue-timeout=600 \
  --dryrun

datacube-alchemist add-to-queue

Search for Datasets and enqueue Tasks into an AWS SQS Queue for later processing.

The --limit is the total number of datasets to enqueue, whereas --product-limit limits the number of datasets per product, in case you have multiple input products.

Usage: datacube-alchemist add-to-queue [OPTIONS] [EXPRESSIONS]...

  Search for Datasets and enqueue Tasks into an AWS SQS Queue for later
  processing.

  EXPRESSIONS

  Select datasets using [EXPRESSIONS] to filter by date, product type, spatial
  extents or other searchable fields.

      FIELD = VALUE
      FIELD in DATE-RANGE
      FIELD in [START, END]
      TIME < DATE
      TIME > DATE

  START and END can be either numbers or dates
  Dates follow YYYY, YYYY-MM, or YYYY-MM-DD format

  FIELD: x, y, lat, lon, time, product, ...

  eg. 'time in [1996-01-01, 1996-12-31]'
      'time in 1996'
      'time > 2020-01'
      'lon in [130, 140]' 'lat in [-40, -30]'
      product=ls5_nbar_albers

Options:
  -c, --config-file TEXT       The path (URI or file) to a config file to use
                               for the job  [required]
  -q, --queue TEXT             Name of an AWS SQS Message Queue  [required]
  -l, --limit INTEGER          For testing, limit the number of tasks to create
                               or process.
  -p, --product-limit INTEGER  For testing, limit the number of datasets per
                               product.
  --dryrun, --no-dryrun        Don't actually do real work
  --help                       Show this message and exit.

Example

datacube-alchemist add-to-queue \
  --config-file ./examples/c3_config_wo.yaml \
  --queue example-queue-name \
  --limit=300 \
  --product-limit=100

datacube-alchemist redrive-to-queue

Redrives messages from an SQS queue.

All the messages in the specified queue are re-transmitted to either their original queue or the specified TO-QUEUE.

Be careful when manually specifying TO-QUEUE, as it's easy to mistakenly push tasks to the wrong queue, e.g. one that will process them with an incorrect configuration file.

Usage: datacube-alchemist redrive-to-queue [OPTIONS]

  Redrives all the messages from the given sqs queue to their source, or the
  target queue

Options:
  -q, --queue TEXT       Name of an AWS SQS Message Queue  [required]
  -l, --limit INTEGER    For testing, limit the number of tasks to create or
                         process.
  -t, --to-queue TEXT    Url of SQS Queue to move to
  --dryrun, --no-dryrun  Don't actually do real work
  --help                 Show this message and exit.

Example

datacube-alchemist redrive-to-queue \
  --queue example-from-queue \
  --to-queue example-to-queue

datacube-alchemist add-missing-to-queue

Search for datasets that don't have a target product dataset and add them to the queue.

If a predicate is supplied, datasets which do not match are filtered out.

The predicate is a Python expression that should return True or False; the dataset being tested is available as the variable d.

Example predicates:

  • d.metadata.gqa_iterative_mean_xy <= 1
  • d.metadata.gqa_iterative_mean_xy and ('2022-06-30' <= str(d.center_time.date()) <= '2023-07-01')
  • d.metadata.dataset_maturity == "final"
Usage: datacube-alchemist add-missing-to-queue [OPTIONS]

  Search for datasets that don't have a target product dataset and add them to
  the queue

  If a predicate is supplied, datasets which do not match are filtered out.

  Example predicate:  - 'd.metadata.gqa_iterative_mean_xy <= 1'

Options:
  --predicate TEXT        Python predicate to filter datasets. Dataset is
                          available as "d"
  -c, --config-file TEXT  The path (URI or file) to a config file to use for the
                          job  [required]
  -q, --queue TEXT        Name of an AWS SQS Message Queue  [required]
  --dryrun, --no-dryrun   Don't actually do real work
  --help                  Show this message and exit.
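
Example

An invocation using the options documented above (the config file and queue name are the same placeholders used in the other examples, and the predicate is one of the examples listed earlier):

datacube-alchemist add-missing-to-queue \
  --config-file ./examples/c3_config_wo.yaml \
  --queue example-queue-name \
  --predicate 'd.metadata.dataset_maturity == "final"' \
  --dryrun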

Configuration File

Datacube Alchemist requires a configuration file in YAML format. It sets up the Algorithm or Transformation to run, the input Dataset(s), and details of the outputs, including metadata, destination and preview image generation.

The configuration file has three sections: specification sets up the input ODC product, the data bands and the transform to run; output sets where the output files will be written, how the preview image will be created, and what extra metadata to include; processing configures the task's memory and CPU requirements.
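
A sketch of the overall shape is below. The specification keys are described in the next section; the output and processing keys shown here are inferred from the settings dumps in the issues further down this page, so the exact names and defaults may differ.

specification:
  # input products, measurements and transform (see the next section)
  ...

output:
  location: s3://example-bucket/derived/   # or a local path
  dtype: uint8
  nodata: 255
  preview_image: [pv, npv, bs]
  metadata:
    product_family: fractional_cover
    producer: ga.gov.au
    dataset_version: 2.0.0
  properties:
    'dea:dataset_maturity': interim

processing:
  dask_chunks:
    x: 1000
    y: 1000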

Specification (of inputs)

Defines the input data and the algorithm to process it.

product or products: [string or list] name(s) of the input ODC product(s) to load data from

measurements: [list] of measurement names to load from the input products

measurement_renames: [map] rename measurements from the input data before passing to the transform

transform: [string] fully qualified name of a Python class implementing the transform

transform_url: [string] Reference URL for the Transform, to record in the output metadata

override_product_family: Override part of the metadata (should be in )

basis: [string] name of the measurement whose native grid is used as the basis when loading the input data (passed through to native_load)

transform_args: [map] Named arguments to pass to the Transformer class

Full example specification

specification:
  products:
    - ga_ls5t_ard_3
    - ga_ls7e_ard_3
    - ga_ls8c_ard_3
  measurements: ['nbart_blue', 'nbart_green', 'nbart_red', 'nbart_nir', 'nbart_swir_1', 'nbart_swir_2', 'oa_fmask']
  measurement_renames:
    oa_fmask: fmask

  aws_unsigned: False
  transform: wofs.virtualproduct.WOfSClassifier
  transform_url: 'https://github.com/GeoscienceAustralia/wofs/'
  override_product_family: ard
  basis: nbart_green

  transform_args:
    dsm_path:  's3://dea-non-public-data/dsm/dsm1sv1_0_Clean.tiff'

Transform Class Implementation
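
The transform named in the specification is imported by its fully qualified name, instantiated with transform_args as keyword arguments, and its compute() method is called on the loaded data (see transform.compute(data) in the worker traceback further down this page). A minimal sketch, assuming the datacube.virtual.Transformation interface that WOfSClassifier and FractionalCover implement; the class name, band names and output measurement are hypothetical:

import xarray as xr

from datacube.model import Measurement
from datacube.virtual import Transformation


class ExampleIndex(Transformation):
    """Hypothetical transform: computes a normalised difference index."""

    def __init__(self, scale=1.0):
        # transform_args from the config file arrive here as keyword arguments
        self.scale = scale

    def measurements(self, input_measurements):
        # Describe the output band(s) this transform produces
        return {
            "example_index": Measurement(
                name="example_index", dtype="float32", nodata=float("nan"), units="1"
            )
        }

    def compute(self, data: xr.Dataset) -> xr.Dataset:
        # `data` holds the (possibly renamed) input measurements loaded by Alchemist
        index = (data.nir - data.red) / (data.nir + data.red) * self.scale
        return index.astype("float32").to_dataset(name="example_index")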

License

Apache License 2.0

Copyright

© 2021, Open Data Cube Community

datacube-alchemist's People

Contributors

alexgleith, ariana-b, dependabot[bot], dunkgray, emmaai, erin-telfer, gypsybojangles, james-blitzm, jeremyh, matthewja, mmochan, omad, pindge, pre-commit-ci[bot], spacemanpaul, uchchwhash, whatnick

datacube-alchemist's Issues

Enforce Dataset Filters via Configuration File

At the moment, DEA is filtering datasets that get sent to the Queue which Alchemist is processing. For example, datasets with low geometric accuracy, or low maturity aren't supposed to be processed and aren't added to the queue.

While this works okay most of the time, it has several flaws:

  • The configuration is split between the config file, and the infrastructure which is stored somewhere completely different.
  • It's hard to reprocess, since the filtering must be done by hand.
  • It's possible to process datasets which shouldn't be processed!

We need to allow configuring a dataset filter inside the configuration file.

A possible way of doing this is to use the same --predicate Python expression form as the add-missing-to-queue command.
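
For illustration only, such a filter might look something like the following in the configuration file; the key is hypothetical and does not exist yet:

specification:
  ...
  dataset_predicate: 'd.metadata.dataset_maturity == "final"'   # hypothetical key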

Stability report

We want to add automated testing/reporting that specifically assesses how consistent the output products are with past releases.

  • Use gdalcompare to compare pixels of sample WO (and other) datasets in the existing DEA collection against regenerated versions (freshly derived from the ARD in the same collection); see the example command after this list.
  • Have CI that summarises the metrics (count of non-identical pixels, and max value change) in a comment for each open PR.
  • Block merges pending completion of said CI
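
A comparison with GDAL's gdalcompare script might look like the following (a sketch only; the filenames are placeholders, and depending on the GDAL version the script is invoked as gdalcompare.py or via python -m osgeo_utils.gdalcompare):

gdalcompare.py existing_dea_wo_scene.tif regenerated_wo_scene.tif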

The motivation is that for the efficient maintenance of a plurality of operationalised alchemist pipelines, all pipelines should be kept using the most current stable release of alchemist. (This facilitates refactoring deployments to be modular, sharing rather than duplicating infracode so as to prevent unintended divergence that leads to redundant debugging labours. And it facilitates upstream patches and API compatibility changes, particularly ODC schema changes, to be deployed in a timely manner. It's impracticable to backport hotfixes and provide long term support of multiple release branches.) This requires a high level of assurance that continuously-integrated changes will not unintentionally affect scientific quality.

A pixel wise comparison gives a high level of confidence that change to orchestration has not altered scientific qualities. It's more useful as a report rather than as a test. It only assesses stability of the output, and can't say which version is better. Sometimes output changes may be expected (due to ARD reprocessing, or performance optimisations such as order-of-operations changes that produce floating-point discrepancies which are scientifically insignificant), and so an arbitrary hard threshold (a blocking test) might promote habitual circumvention, whereas a score attached to each PR seems more likely to encourage consideration of its value before merging. A pixel wise comparison deliberately excludes noise associated with changes in file header, compression, etc (e.g., as the format driver version is incremented).

(Note since WO is a multidimensional categorical bitfield, the max value change is a poor metric, although it may be better suited for FC. In future, could consider a more aware treatment of nodata, if a product such as WO has multiple values that represent nodata.)

Remove surplus dependencies from setup.py

Alchemist has extra libraries listed in setup.py that it does not depend on, and they cause issues when installing it.

Especially:

  • h5py
  • numexpr
  • scipy
  • ephem

Removing them might break some docker images... but we need to remove them.

Publish on PyPI

We didn't publish it to start with, because this was a prototype.

It's safe to say that's no longer the case.

It should be possible to copy some configuration from another ODC repo.

Broken link in docs

The main README links to the datacube search documentation (to explain the syntax supported by Alchemist), but the link is dead.

Improve Documentation

Ensure:

  • A clear description of the 5 different Alchemist Commands.
  • Describe the DEA deployment mode (using AWS SQS)
  • Documentation available both from the CLI and in the GitHub README, covering all available options.

Released versions don't say they're the right version

A release build is pushed when we tag a branch, for example, 0.3.7.

If you run a docker image from this release and check the version, it'll be a version like '0.3.8.dev0+gb57bbac.d20201217', which is not right.

We need to fix it to say it's the right version.

Processing pods in CrashLoopBackOff need error handling

Problem

Alchemist deployments are largely sitting in CrashLoopBackOff state. Ideally a pod should complete gracefully if it cannot start properly because there are no messages available in the queue.

The number next to CrashLoopBackOff is the number of times the pod was restarted:

alchemist-s2-nrt-processing-wo-5655f489bd-4xwj5   0/1     CrashLoopBackOff   1271       5d18h
alchemist-s2-nrt-processing-wo-5655f489bd-7kcbw   0/1     CrashLoopBackOff   1268       5d18h
alchemist-s2-nrt-processing-wo-5655f489bd-7x7wm   0/1     CrashLoopBackOff   439        45h
alchemist-s2-nrt-processing-wo-5655f489bd-c7twg   0/1     CrashLoopBackOff   1312       6d
alchemist-s2-nrt-processing-wo-5655f489bd-crsch   0/1     CrashLoopBackOff   1313       6d
alchemist-s2-nrt-processing-wo-5655f489bd-fmqj9   0/1     CrashLoopBackOff   1274       5d18h
alchemist-s2-nrt-processing-wo-5655f489bd-gb7cp   0/1     CrashLoopBackOff   1270       5d18h
alchemist-s2-nrt-processing-wo-5655f489bd-grlq7   0/1     CrashLoopBackOff   437        45h
alchemist-s2-nrt-processing-wo-5655f489bd-l56pd   0/1     CrashLoopBackOff   1317       6d
alchemist-s2-nrt-processing-wo-5655f489bd-m75cb   0/1     CrashLoopBackOff   1317       6d
alchemist-s2-nrt-processing-wo-5655f489bd-mhrl6   0/1     CrashLoopBackOff   1216       5d12h
alchemist-s2-nrt-processing-wo-5655f489bd-nn8mq   0/1     CrashLoopBackOff   1272       5d18h
alchemist-s2-nrt-processing-wo-5655f489bd-pjsj2   0/1     CrashLoopBackOff   1314       6d
alchemist-s2-nrt-processing-wo-5655f489bd-qhgjl   0/1     CrashLoopBackOff   1272       5d18h
alchemist-s2-nrt-processing-wo-5655f489bd-qp2m5   0/1     CrashLoopBackOff   1315       6d
alchemist-s2-nrt-processing-wo-5655f489bd-rgm65   0/1     CrashLoopBackOff   436        45h
alchemist-s2-nrt-processing-wo-5655f489bd-rhdnk   0/1     CrashLoopBackOff   335        34h
alchemist-s2-nrt-processing-wo-5655f489bd-rq7lv   0/1     CrashLoopBackOff   338        34h
alchemist-s2-nrt-processing-wo-5655f489bd-w8smj   0/1     CrashLoopBackOff   438        45h
alchemist-s2-nrt-processing-wo-5655f489bd-wh5vl   0/1     CrashLoopBackOff   1317       6d

Proposed solution

We need to add exception handling here:

def run_from_queue(config_file, queue, limit, queue_timeout, dryrun, sns_arn):
    """
    Process messages from the given queue
    """
    alchemist = Alchemist(config_file=config_file)
    tasks_and_messages = alchemist.get_tasks_from_queue(queue, limit, queue_timeout)
    errors = 0
    successes = 0
    for task, message in tasks_and_messages:
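
One possible shape for that error handling is sketched below; execute_task mirrors the worker function visible in the tracebacks further down, and message.delete() assumes the boto3 SQS resource API, so the details are assumptions rather than the actual implementation.

    for task, message in tasks_and_messages:
        try:
            execute_task(task)  # process the task (worker function, per the tracebacks below)
            message.delete()    # success: remove the message so it is not retried
            successes += 1
        except Exception:
            # Count the failure instead of crashing the pod; the message becomes
            # visible again after the queue timeout and will eventually land on
            # the dead-letter queue.
            errors += 1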

Add support for `gqa_eo`

Currently explicitly supported metadata types are eo, eo_plus, and eo_s2_nrt. Sentinel 2 ARD uses gqa_eo and this is not explicitly supported in _utils.py.

pre-commit checks are failing

Including:

  • formatters are converting python files to double quotes and back to single quotes
  • yamllint is generally more trouble than it's worth

Examples not working within fresh Docker container

I'm not sure if this is expected to work without additional configuration, but it looks like a package may be missing from the built Docker image?

root@9429c110f08e:/code# datacube-alchemist run-many \
>   --config-file ./examples/c3_config_wo.yaml \
>   --limit=2 \
>   --dryrun \
>   time in 2020-01
Traceback (most recent call last):
  File "/env/bin/datacube-alchemist", line 33, in <module>
    sys.exit(load_entry_point('datacube-alchemist', 'console_scripts', 'datacube-alchemist')())
  File "/env/bin/datacube-alchemist", line 22, in importlib_load_entry_point
    for entry_point in distribution(dist_name).entry_points
  File "/env/lib/python3.6/site-packages/importlib_metadata/__init__.py", line 833, in distribution
    return Distribution.from_name(distribution_name)
  File "/env/lib/python3.6/site-packages/importlib_metadata/__init__.py", line 448, in from_name
    raise PackageNotFoundError(name)
importlib_metadata.PackageNotFoundError: No package metadata was found for datacube-alchemist

final error: ValueError: 'pv' is not in list

I'm connecting to this db;
db_hostname: agdcstaging-db.nci.org.au
db_port: 6432
db_database: ard_interop

datacube-alchemist run-many -E ard_interop --limit 1 fc_config.yaml [product: ls5_ard]

(py)[vdi-n4:aws_fc_testing] datacube-alchemist run-many -E ard_interop --limit 1 fc_config.yaml
2019-08-14 11:48.16 started dask                   dask_client=<Client: scheduler='tcp://127.0.0.1:42524' processes=8 cores=8>
2019-08-14 11:48.16 processing task stream
2019-08-14 11:48.19 data loaded                    task=AlchemistTask(dataset=Dataset <id=279f62b2-7d0a-4fe8-80ba-b0e235dcc0b3 type=ls5_ard location=file:///g/data1b/if87/ARD_interoperability/ga-packaged_collection/1986-08-17/LT50950741986229ASA00/ARD-METADATA.yaml>, settings=AlchemistSettings(specification=Specification(product='ls5_ard', measurements=['nbart_green', 'nbart_red', 'nbart_nir', 'nbart_swir_1', 'nbart_swir_2'], transform='fc.virtualproduct.FractionalCover', measurement_renames={'nbart_green': 'green', 'nbart_red': 'red', 'nbart_nir': 'nir', 'nbart_swir_1': 'swir1', 'nbart_swir_2': 'swir2'}, transform_args={'regression_coefficients': {'blue': [0.00041, 0.9747], 'green': [0.00289, 0.99779], 'red': [0.00274, 1.00446], 'nir': [4e-05, 0.98906], 'swir1': [0.00256, 0.99467], 'swir2': [-0.00327, 1.02551]}}), output=OutputSettings(location='/g/data/u46/users/dsg547/data/c3-testing/', dtype=dtype('uint8'), nodata=255, preview_image=['pv', 'npv', 'bs'], metadata={'product_family': 'fractional_cover', 'producer': 'ga.gov.au', 'dataset_version': '2.0.0'}, properties={'dea:dataset_maturity': 'interim'}), processing=ProcessingSettings(dask_chunks={'x': 1000, 'y': 1000}, dask_client={})))

distributed.worker - WARNING -  Compute Failed
Function:  lump_proc
args:      ((AlchemistTask(dataset=Dataset <id=279f62b2-7d0a-4fe8-80ba-b0e235dcc0b3 type=ls5_ard location=file:///g/data1b/if87/ARD_interoperability/ga-packaged_collection/1986-08-17/LT50950741986229ASA00/ARD-METADATA.yaml>, settings=AlchemistSettings(specification=Specification(product='ls5_ard', measurements=['nbart_green', 'nbart_red', 'nbart_nir', 'nbart_swir_1', 'nbart_swir_2'], transform='fc.virtualproduct.FractionalCover', measurement_renames={'nbart_green': 'green', 'nbart_red': 'red', 'nbart_nir': 'nir', 'nbart_swir_1': 'swir1', 'nbart_swir_2': 'swir2'}, transform_args={'regression_coefficients': {'blue': [0.00041, 0.9747], 'green': [0.00289, 0.99779], 'red': [0.00274, 1.00446], 'nir': [4e-05, 0.98906], 'swir1': [0.00256, 0.99467], 'swir2': [-0.00327, 1.02551]}}), output=OutputSettings(location='/g/data/u46/users/dsg547/data/c3-testing/', dtype=dtype('uint8'), nodata=255, preview_image=['pv', 'npv', 'bs'], metadata={'product_family': 'fractional_cover', 'producer': 'ga.gov.au', 'dataset_
kwargs:    {}
Exception: ValueError("'pv' is not in list",)

Traceback (most recent call last):
  File "/g/data/u46/users/dsg547/py/current/bin/datacube-alchemist", line 11, in <module>
    load_entry_point('datacube-alchemist', 'console_scripts', 'datacube-alchemist')()
  File "/g/data/v10/public/modules/dea-env/20181015/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/g/data/v10/public/modules/dea-env/20181015/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/g/data/v10/public/modules/dea-env/20181015/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/g/data/v10/public/modules/dea-env/20181015/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/g/data/v10/public/modules/dea-env/20181015/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/g/data1a/u46/users/dsg547/sandpit/datacube-alchemist/datacube_alchemist/cli.py", line 38, in run_many
    execute_with_dask(client, tasks)
  File "/g/data1a/u46/users/dsg547/sandpit/datacube-alchemist/datacube_alchemist/worker.py", line 102, in execute_with_dask
    for result in completed:
  File "/g/data1a/u46/users/dsg547/sandpit/datacube-alchemist/datacube_alchemist/_dask.py", line 80, in dask_compute_stream
    yield from yy.result()
  File "/g/data/v10/public/modules/dea-env/20181015/lib/python3.6/site-packages/distributed/client.py", line 195, in result
    six.reraise(*result)
  File "/g/data/v10/public/modules/dea-env/20181015/lib/python3.6/site-packages/six.py", line 692, in reraise
    raise value.with_traceback(tb)
  File "/g/data1a/u46/users/dsg547/sandpit/datacube-alchemist/datacube_alchemist/_dask.py", line 49, in lump_proc
    return [func(d) for d in dd]
  File "/g/data1a/u46/users/dsg547/sandpit/datacube-alchemist/datacube_alchemist/_dask.py", line 49, in <listcomp>
    return [func(d) for d in dd]
  File "/g/data1a/u46/users/dsg547/sandpit/datacube-alchemist/datacube_alchemist/worker.py", line 124, in execute_task
    output_data = transform.compute(data)
  File "/g/data1a/u46/users/dsg547/sandpit/fc/fc/virtualproduct.py", line 61, in compute
    fc.append(fractional_cover(data.sel(**s), measurements, self.regression_coefficients))
  File "/g/data1a/u46/users/dsg547/sandpit/fc/fc/fractional_cover.py", line 104, in fractional_cover
    dataset = Datacube.create_storage({}, nbar_tile.geobox, measurements, data_func)
  File "/g/data1a/u46/users/dsg547/sandpit/datacube-core/datacube/api/core.py", line 453, in create_storage
    data = data_func(measurement)
  File "/g/data1a/u46/users/dsg547/sandpit/fc/fc/fractional_cover.py", line 96, in data_func
    i = band_names.index(src_var)
ValueError: 'pv' is not in list

run-one is crashing with AttributeError: 'generator' object has no attribute 'type'

/g/data/v10/public/modules/dea-env/20190709/bin/python /g/data/u46/users/dsg547/sandpit/datacube-alchemist/datacube_alchemist/cli.py run-one -E odc_conf_test examples/nci_ls8c_l2_c2tofc.yaml /g/data/u46/users/dsg547/db_data/deafrica-usgs-c2-data_usgs_ls8c_level2_2_201_047_2018_01_04/usgs_ls8c_level2_2-0-20190830_201047_2018-01-04.odc-metadata.yaml
Traceback (most recent call last):
  File "/g/data/u46/users/dsg547/sandpit/datacube-alchemist/datacube_alchemist/cli.py", line 201, in <module>
    cli_with_envvar_handling()
  File "/g/data/u46/users/dsg547/sandpit/datacube-alchemist/datacube_alchemist/cli.py", line 27, in cli_with_envvar_handling
    cli(auto_envvar_prefix='ALCHEMIST')
  File "/g/data/v10/public/modules/dea-env/20190709/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/g/data/v10/public/modules/dea-env/20190709/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/g/data/v10/public/modules/dea-env/20190709/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/g/data/v10/public/modules/dea-env/20190709/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/g/data/v10/public/modules/dea-env/20190709/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/g/data/u46/users/dsg547/sandpit/datacube-alchemist/datacube_alchemist/cli.py", line 80, in run_one
    execute_task(task)
  File "/g/data/u46/users/dsg547/sandpit/datacube-alchemist/datacube_alchemist/worker.py", line 87, in execute_task
    basis=task.settings.specification.basis)
  File "/g/data1a/u46/users/dsg547/sandpit/datacube-core/datacube/testutils/io.py", line 99, in native_load
    geobox = native_geobox(ds, measurements, basis)  # early exit via exception if no compatible grid exists
  File "/g/data1a/u46/users/dsg547/sandpit/datacube-core/datacube/testutils/io.py", line 65, in native_geobox
    gs = ds.type.grid_spec
AttributeError: 'generator' object has no attribute 'type'

Process finished with exit code 1
