cumulus-message-adapter's Introduction

Cumulus Message Adapter

cumulus-message-adapter is a command-line interface for preparing and outputting Cumulus Messages for Cumulus Tasks. cumulus-message-adapter helps Cumulus developers integrate a task into a Cumulus Workflow.

Read more about how the cumulus-message-adapter works in the CONTRACT.md.

Releases

Release Versions

Please note the following convention for release versions:

X.Y.Z, where:

  • X is an organizational release that signifies the completion of a core set of functionality
  • Y is a major version release that may include incompatible API changes and/or other breaking changes
  • Z is a minor version that includes bugfixes and backwards compatible improvements
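As a rough illustration of the convention above, the sketch below splits a version string into its X, Y and Z parts and classifies the change between two versions. The names ReleaseVersion, parse_version and classify_bump are hypothetical and are not part of cumulus-message-adapter; the sketch also assumes plain three-part numeric versions.

```python
from typing import NamedTuple


class ReleaseVersion(NamedTuple):
    organizational: int  # X: completion of a core set of functionality
    major: int           # Y: may include breaking changes
    minor: int           # Z: bugfixes and backwards compatible improvements


def parse_version(text: str) -> ReleaseVersion:
    """Parse 'X.Y.Z' into its three integer components."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected X.Y.Z, got {text!r}")
    return ReleaseVersion(*(int(part) for part in parts))


def classify_bump(old: str, new: str) -> str:
    """Name the most significant component that increased from old to new."""
    before, after = parse_version(old), parse_version(new)
    if after <= before:
        return "none"
    if after.organizational != before.organizational:
        return "organizational"
    if after.major != before.major:
        return "major"
    return "minor"


print(classify_bump("1.2.3", "1.3.0"))  # major
```

Under this convention, a "major" result is the signal to check for incompatible API changes before upgrading.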

Continuous Integration

CircleCI manages releases and release assets.

Whenever CircleCI passes on the master branch of cumulus-message-adapter and message_adapter/version.py has been updated with a version that doesn't match an existing tag, CircleCI will:

  • Create a new tag with tag_name of the string in message_adapter/version.py
  • Create a new release using the new tag, with a name equal to tag_name (equal to version).
  • Build a cumulus-message-adapter.zip file and attach it as a release asset to the newly created release. The zip file is created using the Makefile in the root of this repository.

These steps are fully detailed in the .circleci/config.yml file.
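The release decision described above can be sketched in a few lines. This is only an illustration of the logic, not the contents of .circleci/config.yml; plan_release is a hypothetical helper, and the version string and tag list are passed in rather than read from the repository.

```python
from typing import List


def plan_release(version: str, existing_tags: List[str]) -> List[str]:
    """Return the release steps to run, or an empty list if the tag already exists."""
    if version in existing_tags:
        # The version in message_adapter/version.py matches an existing tag:
        # nothing to release.
        return []
    return [
        f"create tag {version}",
        f"create release named {version} from tag {version}",
        "build cumulus-message-adapter.zip and attach it to the release",
    ]


for step in plan_release("1.2.0", ["1.0.0", "1.1.0"]):
    print(step)
```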

Development

Dependency Installation

pip install -r requirements-dev.txt
pip install -r requirements.txt

Running Tests

Running tests requires localstack.

The tests only need the S3 service, which can be started with the following command:

EAGER_SERVICE_LOADING=1 SERVICES=s3 localstack start

You can then check that the tests pass with the following nose2 command:

CUMULUS_ENV=testing nose2 -v

Linting

pylint message_adapter

Contributing

If changes are made to the codebase, you can create the cumulus-message-adapter zip archive for testing libraries that require it:

make clean
make cumulus-message-adapter.zip

Then you can run some integration tests:

./examples/example-node-message-adapter-lib.js

Before any changes are finalized and released, they should be tested by packaging the cumulus-message-adapter zip archive and testing it in a lambda environment, as that is where it will be utilized.

Packaging

Packaging the zip file is probably best done in an environment that closely matches the lambda environment in which it will be run and contains the current Python version, so we are using an AWS Python Lambda image. Certain packages need to be installed, and using a virtual environment is important due to Python pathing.

docker run -v ~/projects/cumulus-message-adapter/:/cma/ -v ~/tmp/:/tmp/ -v ~/amazon/:/home/amazon/ -it --entrypoint /bin/bash amazon/aws-lambda-python:3.10
yum install -y make binutils zip
cd /cma
pip install --user virtualenv
~/.local/bin/virtualenv ~/venv310
. ~/venv310/bin/activate
pip install .
make clean
make cumulus-message-adapter.zip

Testing the package in a Lambda Environment

Once the package is created, it should be tested in a Lambda environment. Before doing so, it may be helpful to run the package in the container it was packaged in, immediately after the above commands, to see whether any errors occur; an error at this point indicates a problem in creating the package:

./dist/cma stream

If no errors occur immediately, you can optionally test the zip in an AWS Lambda NodeJS image, as that is the target environment. Running in an image may allow for quicker testing and development, but testing in AWS should still be the final test.

docker run -v ~/projects/cumulus-message-adapter:/zipfile --entrypoint /bin/bash -it amazon/aws-lambda-nodejs:16
cd /zipfile
cp -r dist /opt/
cd /opt/dist
./cma stream

Testing the package in AWS Lambda requires uploading the zip as a layer and then running a Cumulus step function that utilizes that layer. The following instructions are for Cumulus Core team members that have access to a layer specifically set up for this purpose.

  • In the AWS console, go to Lambda > Layers > CMA_Test
  • Create a new version by uploading the cumulus-message-adapter zip file packaged earlier
  • In your /cumulus-tf/terraform.tfvars, replace the cumulus_message_adapter_lambda_layer_version_arn value with the newly created Version ARN
  • Apply the change with terraform apply
  • Find any recent successfully run Step Function, and run a New Execution. The 'Functions using this version' tab of the CMA_Test layer should provide some options.

Troubleshooting

  • Error: "DistutilsOptionError: must supply either home or prefix/exec-prefix -- not both" when running make cumulus-message-adapter.zip

cumulus-message-adapter's Issues

NaN values cause AWS Step Functions parsing error

It seems that the way the Python JSON serializer serializes NaN values is not understood by AWS Step Functions.

I saw this error in one of our executions recently:

[image: screenshot of the error]

It was caused by a parsing error in the payload from the cumulus message adapter, because one of the fields in our meta section had a NaN value.

[image]

I confirmed that the Python json module emits NaN as a bare NaN token, which is not valid JSON, and accepts the same token when parsing:

>>> import json
>>> json.dumps({"foo": float("nan")})
'{"foo": NaN}'
>>> json.loads(json.dumps({"foo": float("nan")}))
{'foo': nan}
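One possible workaround, sketched here as an assumption rather than as anything CMA currently does, is to serialize strictly and to replace non-finite floats before the message is written. json.dumps accepts allow_nan=False, which raises ValueError instead of emitting NaN; replace_nan is a hypothetical helper, and mapping NaN to null is just one choice of replacement.

```python
import json
import math


def replace_nan(value):
    """Recursively replace NaN and infinite floats with None (JSON null)."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: replace_nan(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [replace_nan(item) for item in value]
    return value


message = {"meta": {"foo": float("nan"), "bar": 1.5}}

# Strict mode refuses to emit the non-standard NaN token.
try:
    json.dumps(message, allow_nan=False)
except ValueError as err:
    print("strict serialization failed:", err)

print(json.dumps(replace_nan(message)))  # {"meta": {"foo": null, "bar": 1.5}}
```

Failing loudly with allow_nan=False surfaces the problem in the task itself, instead of later as a Step Functions parsing error.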

CMA output name changes for an array

We ran into a situation where different lambda functions in the same workflow come from different component versions and use different schemas.

One solution is to change the names using CMA as follows:

"task_config": {
  "cumulus_message": {
      "outputs": [
        {
          "source": "{$.payload.granules[0].files[0].filepath}",
          "destination": "{$.payload.granules[0].files[0].key}"
        },
        {
          "source": "{$.payload.granules[0].files[1].filepath}",
          "destination": "{$.payload.granules[0].files[1].key}"
        },
        {
          "source": "{$.payload.granules[1].files[0].filepath}",
          "destination": "{$.payload.granules[1].files[0].key}"
        },
        {
          "source": "{$.payload.granules[1].files[1].filepath}",
          "destination": "{$.payload.granules[1].files[1].key}"
        }
      ]
    }
}

However, as shown above, because this name change involves multiple array items, we have to write out a long list covering each of them. More importantly, the array size can only be determined at run time, so the above solution will not work in general.

Regarding this, I noticed that array support is indeed implemented in CMA input:

  "task_config": {
    "inlinestr": "prefix{meta.foo}suffix",
    "array": "{[$.meta.foo]}",
    "object": "{$.meta}"
  },

Looking into the CMA source code, I found that the array is resolved with JSONPath, so the following also works:

    "array": "{[$.meta.array[*].foo]}",

However, this is only implemented for input, not for output name changes.

So it would be nice (and symmetric) if JSONPath-based name changes were also implemented for output, so that the long task config above could be written as follows, making it short and dynamic:

"task_config": {
  "cumulus_message": {
      "outputs": [
        {
          "source": "{[$.payload.granules[*].files[*].filepath]}",
          "destination": "{[$.payload.granules[*].files[*].key]}"
        }
      ]
    }
}
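To make the request concrete, here is a minimal sketch of what such a wildcard rename could do to a message. It is not CMA code and does not use CMA's JSONPath resolver: tokens, walk and rename_field are hypothetical helpers, the paths are given as plain JSONPath strings without the surrounding template braces, and the sketch only handles the case where source and destination differ in the final key.

```python
import re
from typing import Any, Iterator, List


def tokens(path: str) -> List[str]:
    """Split '$.a.b[*].c' into ['a', 'b', '*', 'c']."""
    return re.findall(r"[^.\[\]$]+", path)


def walk(node: Any, parts: List[str]) -> Iterator[Any]:
    """Yield every node reached by following parts; '*' fans out over list items."""
    if not parts:
        yield node
        return
    head, rest = parts[0], parts[1:]
    if head == "*":
        for item in node:
            yield from walk(item, rest)
    elif head.isdigit():
        yield from walk(node[int(head)], rest)
    else:
        yield from walk(node[head], rest)


def rename_field(message: dict, source: str, destination: str) -> dict:
    """Rename the last key of source to the last key of destination, in place."""
    src, dst = tokens(source), tokens(destination)
    if src[:-1] != dst[:-1]:
        raise ValueError("this sketch only renames a key within the same parent path")
    for parent in walk(message, src[:-1]):
        if src[-1] in parent:
            parent[dst[-1]] = parent.pop(src[-1])
    return message


message = {
    "payload": {
        "granules": [
            {"files": [{"filepath": "a/1"}, {"filepath": "a/2"}]},
            {"files": [{"filepath": "b/1"}]},
        ]
    }
}
rename_field(
    message,
    "$.payload.granules[*].files[*].filepath",
    "$.payload.granules[*].files[*].key",
)
print(message["payload"]["granules"][0]["files"])  # [{'key': 'a/1'}, {'key': 'a/2'}]
```

Because the wildcard is expanded against the actual message, the same single mapping works for any number of granules and files.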

jsonschema 3 broken

cumulus-message-adapter has a requirement of

jsonschema>=2.6.0

As a result the latest release, jsonschema 3, is installed, and it is broken by this bug:
pyinstaller/pyinstaller#4100

Loading a Lambda handler that imports cumulus_message_adapter_python fails with a module initialization error:

module initialization error The 'jsonschema' distribution was not found and is required by the application

This suggests that jsonschema cannot be found, but the failure is actually due to that bug.

It happens when the library is used in a deployed Python package. When I imported it normally it worked, so I am not completely sure why, but reverting to jsonschema 2.6.0 fixes it.

So jsonschema should be pinned to 2.6.0, or constrained with

jsonschema~=2.6.0

which installs the latest compatible 2.6.x release, but not a newer minor or major version.
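For readers unfamiliar with the ~= operator, the sketch below approximates which versions a compatible-release constraint accepts. It is a simplification for plain numeric versions only; matches_compatible_release is a hypothetical helper, and real installers apply the full version-specifier rules rather than this logic.

```python
def matches_compatible_release(candidate: str, base: str = "2.6.0") -> bool:
    """Approximate '~=base': at least base, and equal to base in all but the last part."""
    want = tuple(int(part) for part in base.split("."))
    have = tuple(int(part) for part in candidate.split("."))
    # Pad short candidates with zeros so that "2.6" is treated like "2.6.0".
    have = have + (0,) * (len(want) - len(have))
    prefix = want[:-1]
    return have >= want and have[: len(prefix)] == prefix


for version in ["2.5.9", "2.6.0", "2.6.1", "2.7.0", "3.0.0"]:
    print(version, matches_compatible_release(version))
```

With a base of 2.6.0, only the 2.6.x releases pass, so jsonschema 3 would be excluded.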

Output array mapping

It seems that CMA currently doesn't allow output mapping for array objects, e.g.

          "source": "{[$.granules[*].files[*].filename]}",
          "destination": "{[$.payload.granules[*].files[*].NewFileName]}"

I think this can be useful in general and can be implemented.
