data-relay's Introduction

Easily send and receive application data with major cloud providers or other hosts via messaging or specialized services.

Highlights

  • Messaging: Send/Receive messages with other hosts
  • InfluxDB: Send data to the popular time series database
  • Major cloud providers: AWS, Azure or Google Cloud Pub/Sub

Documentation

Head over to our docs to see how to get started and understand how it works.

Motivation

Cloud providers include a variety of services to consume and provide application data, but each works differently. The Data Relay block provides a common, simple way to exchange application data with the cloud or other hosts.

This project is in active development so if you have any feature requests or issues please submit them here on GitHub. PRs are welcome, too.

License

Data Relay block is free software, and may be redistributed under the terms specified in the license.

data-relay's Issues

Add per-service versioning?

A dapr YAML file includes two versioning attributes: apiVersion and spec->version. We should establish a policy, or work out a design, for handling new versions of a YAML file that change the variables a balena user must define. The aim is to avoid a disruptive, flag-day style change in the future.
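
For reference, a hypothetical component file shows where both attributes live; a bump in either place could change the variables a user must set:

apiVersion: dapr.io/v1alpha1   # API version of the component schema itself
kind: Component
metadata:
  name: aws-sqs                # illustrative component name
spec:
  type: bindings.aws.sqs
  version: v1                  # version of the binding implementation
  metadata:
    - name: queueName
      value: "my-queue"        # illustrative value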

Namespace cloud block environment variables

Users must define these variables, so let's make the names consistent and clear. Users will define variables for the cloud services as well.

MQTT_INPUT --> CLOUD_BLOCK_INPUT_TOPIC
DAPR_DEBUG --> CLOUD_BLOCK_DEBUG

Remove non-output-components.txt

The file non-output-components.txt is used to identify the components which do not send output to the cloud provider. These components include the MQTT input component as well as the secret stores.

We must identify these components so that we do not push output data to them. This file was intended as a shortcut to reviewing the definitions of all component files. To identify non-output components without the file we must perform the search below.

For now, rather than use a file or execute the search below, we should just hard-code the list of non-output components.

  • Create a list to contain blacklisted components
  • For each .py file in the plugins directory:
    • If the TYPE attribute is not "output", read the FILE attribute for the YAML file
    • Open the YAML file and parse its contents to find the metadata.name attribute
    • Add the name to the list of blacklisted components
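
A minimal sketch of that search, assuming each plugin module exposes TYPE and FILE attributes and that PyYAML is available:

import importlib.util
import pathlib

import yaml

PLUGINS_DIR = pathlib.Path("plugins")  # assumed location of the plugin modules

def find_non_output_components():
    """Collect metadata.name for every component whose plugin is not an output."""
    blacklist = []
    for plugin_path in PLUGINS_DIR.glob("*.py"):
        spec = importlib.util.spec_from_file_location(plugin_path.stem, plugin_path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        if getattr(module, "TYPE", None) != "output":
            # The plugin's FILE attribute points at its dapr component YAML.
            with open(module.FILE) as f:
                component = yaml.safe_load(f)
            blacklist.append(component["metadata"]["name"])
    return blacklist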

Add RESTful topic names

Presently we have only two MQTT topics -- relay-out and relay-in by default. These names, though, really act as adapters for use by user containers. We need a scheme for naming topics that provides clarity and allows for multiple inputs or outputs.

We plan to use a RESTful approach. For example, to output to an AWS SQS queue, we would use a default name like aws-sqs, as illustrated below.
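
For illustration only, a RESTful scheme might produce one topic per service, with direction encoded in the path (all names hypothetical):

relay/out/aws-sqs     # user containers publish here to reach the AWS SQS queue
relay/out/gcp-pubsub  # publishes are forwarded to the GCP Pub/Sub topic
relay/in/aws-sqs      # messages arriving from SQS appear here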

dapr Binding vs. PubSub

Presently we use the dapr Binding API for most (or all) cloud services. This API focuses on a specific resource, such as a particular Pub/Sub topic, and allows values to be both read and written. In other words, the API is a bidirectional binding to a particular resource.

However, a user's application may wish to write to several Pub/Sub topics rather than just one. The dapr Binding API does not support this use case, but the dapr PubSub API does. In that case the user must somehow specify the cloud topic with each MQTT message incoming to the cloud block on the device.

This use case requires some design to make it as simple as possible for the user. We must consider how incoming topics are named at the cloud block on the device, as well as the cloud-side service variable, like GCP_PUBSUB_TOPIC for the GCP Pub/Sub binding.
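
For comparison, the dapr PubSub API takes the topic per call, so the device side could choose a different cloud topic for every message. A sketch using the dapr Python SDK (component and topic names illustrative):

import json

from dapr.clients import DaprClient

# The Binding API is pinned to one resource per component file; publish_event
# instead accepts the topic on every call.
with DaprClient() as client:
    client.publish_event(
        pubsub_name="gcp-pubsub",   # hypothetical pubsub component name
        topic_name="telemetry",     # chosen per message by the user application
        data=json.dumps({"temperature": 21.5}),
        data_content_type="application/json",
    )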

/app/components stores sensitive data on disk

The /app/components directory contains the dapr component configuration files for the cloud container. These files include secrets to access the components on the cloud. We'd like to minimize access to these files to make them more secure.
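
One low-effort mitigation, sketched under the assumption that only the container's root user needs the files, is to tighten permissions right after the files are written:

import os
import pathlib

COMPONENTS_DIR = pathlib.Path("/app/components")

# Restrict the directory and every file under it to the owner only.
os.chmod(COMPONENTS_DIR, 0o700)
for entry in COMPONENTS_DIR.rglob("*"):
    os.chmod(entry, 0o700 if entry.is_dir() else 0o600)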

Make use of common environment variables consistent

Presently the AWS service configuration reuses variables for AWS access key and secret key for both the SQS and S3 services. However, the Azure and GCP service configurations always define unique variables for each service. See the links in the Environment Variables section of the README.md to compare.

We should have a consistent and simple policy for reuse of common parameters. "Consistency" of course is in the mind of the beholder, and requires an understanding of how the variables are used in each service.

For GCP consider consolidating some of the variables, like URIs, which are unlikely to change. Verify with GCP rep.

Document certificate key file handling

Currently users need to copy their .pfx file into the /app/components/secrets directory in their Dockerfile. We need to explain this in the README, or in a separate documentation file.
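
Until that documentation exists, the step amounts to one line in the user's Dockerfile (file name hypothetical):

COPY my-certificate.pfx /app/components/secrets/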

Update Landr docs

In general we want documentation to support both the Landr and balena-io styles. Landr docs have become a little outdated since initial development. README.md could use a bit more fleshing out. ARCHITECTURE.md also could use more detail.

This doc also serves to describe MVP and roadmap.

Runtime fails when loading dapr grpc on fincm3 from Alpine build

The main cloud block runtime fails when loading the dapr grpc package on fincm3 (armv7). The error is shown below. This error started to occur after merge of #41 to use an Alpine-python-build image to generate files for an Alpine-python-run image. The error does not occur with amd64 or arm64 builds.

I did find a similar issue in the grpc project that was resolved, but it included the discouraging comment:

Building and running in python alpine arm images has always posed issues with grpcio.

Traceback (most recent call last):
12.05.21 08:08:03 (-0400)  cloud    File "./src/main.py", line 6, in <module>
12.05.21 08:08:03 (-0400)  cloud      from dapr.ext.grpc import App, BindingRequest
12.05.21 08:08:03 (-0400)  cloud    File "/root/.local/lib/python3.8/site-packages/dapr/ext/grpc/__init__.py", line 8, in <module>
12.05.21 08:08:03 (-0400)  cloud      from dapr.clients.grpc._request import InvokeMethodRequest, BindingRequest
12.05.21 08:08:03 (-0400)  cloud    File "/root/.local/lib/python3.8/site-packages/dapr/clients/__init__.py", line 12, in <module>
12.05.21 08:08:03 (-0400)  cloud      from dapr.clients.grpc.client import DaprGrpcClient, MetadataTuple, InvokeMethodResponse
12.05.21 08:08:03 (-0400)  cloud    File "/root/.local/lib/python3.8/site-packages/dapr/clients/grpc/client.py", line 11, in <module>
12.05.21 08:08:03 (-0400)  cloud      import grpc  # type: ignore
12.05.21 08:08:03 (-0400)  cloud    File "/root/.local/lib/python3.8/site-packages/grpc/__init__.py", line 23, in <module>
12.05.21 08:08:03 (-0400)  cloud      from grpc._cython import cygrpc as _cygrpc
12.05.21 08:08:03 (-0400)  cloud  ImportError: Error loading shared library ld-linux-armhf.so.3: No such file or directory (needed by /root/.local/lib/python3.8/site-packages/grpc/_cython/cygrpc.cpython-38-arm-linux-gnueabihf.so)

I tried adding the libc6-compat package when running install_packages in the Dockerfile because that package contains ld-linux-armhf.so.3. The error then changes to:

[Logs]    [5/12/2021, 9:56:11 AM] [cloud] ImportError: Error relocating /root/.local/lib/python3.8/site-packages/grpc/_cython/cygrpc.cpython-38-arm-linux-gnueabihf.so: __xstat: symbol not found

Expose component dapr endpoints

This will enable file uploads to blob storage, blob GET operations, and services using bindings via HTTP rather than the MQTT route.
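
With the daprd HTTP port exposed, user containers could call the standard dapr bindings endpoint directly. A sketch against a hypothetical azure-blob component, assuming daprd listens on its default port 3500:

import requests

DAPR_URL = "http://localhost:3500/v1.0/bindings/azure-blob"

# Upload a file through the binding's "create" operation.
requests.post(DAPR_URL, json={
    "operation": "create",
    "data": "file contents here",
    "metadata": {"blobName": "readings.txt"},
})

# Fetch it back with the "get" operation.
resp = requests.post(DAPR_URL, json={
    "operation": "get",
    "metadata": {"blobName": "readings.txt"},
})
print(resp.text)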

Attempting to run daprd with secret components, when no Azure KeyVault details added

If no secret store is configured, the AzureSecretsKeyvault plugin exits correctly. But it doesn't return a status to show that it did not configure a YAML file. daprd then runs and tries to connect to a secret store for which we have no component, producing a huge error:

25.03.21 20:43:46 (+0000)  cloud  Looking for azureehconnectionstring
25.03.21 20:43:46 (+0000)  cloud  DEBU[0003] {ERR_SECRET_STORES_NOT_CONFIGURED secret store is not configured}  app_id=cloudBlock instance=ca8fd92 scope=dapr.runtime.http type=log ver=1.0.1
25.03.21 20:43:46 (+0000)  cloud  Getting secrets failed with status code %d
25.03.21 20:43:46 (+0000)  cloud  {'errorCode': 'ERR_SECRET_STORES_NOT_CONFIGURED', 'message': 'secret store is not configured'}

[the same four lines repeat for azureehconsumergroup, azureehstorageaccount, azureehstorageaccountkey, azureehcontainername, azureblobstorageaccount, azureblobstorageaccountkey and azureblobcontainername]

25.03.21 20:43:46 (+0000)  cloud  INFO[0003] dapr shutting down. Waiting 5 seconds to finish outstanding operations  app_id=cloudBlock instance=ca8fd92 scope=dapr.runtime type=log ver=1.0.1
25.03.21 20:43:46 (+0000)  cloud  INFO[0003] stop command issued. Shutting down all operations  app_id=cloudBlock instance=ca8fd92 scope=dapr.runtime type=log ver=1.0.1

The block then runs normally.

We can remove this error by having the AzureSecretsKeyvault plugin return a success status, and by only running daprd against the secret store once one has been successfully configured.
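
A minimal sketch of that gate, assuming a hypothetical list of the variables the Key Vault component requires:

import os

# Hypothetical: everything the Key Vault component YAML needs.
REQUIRED_KEYVAULT_VARS = ("AZURE_KEYVAULT_NAME", "AZURE_TENANT_ID", "AZURE_CLIENT_ID")

def secret_store_configured() -> bool:
    """True only when every variable needed to write the Key Vault YAML is set."""
    return all(os.environ.get(name) for name in REQUIRED_KEYVAULT_VARS)

if secret_store_configured():
    pass  # write the component YAML, then run daprd to fetch secrets
else:
    print("No secret store configured; skipping secret lookup")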

Refactor collection/use of component configuration parameters

While working on GCP secret store support (#16), we found that the content of a variable value can be difficult to manage with the current scheme. In particular, when reading a secret from a secret store, we push the secret back into the environment as an OS variable for the script running getSecrets.py. This approach fails when the variable includes a space.

Essentially the current implementation uses the OS environment as a dictionary for the Python processes, which is inconvenient and prone to these sorts of issues. In addition, three Python processes presently execute sequentially to retrieve and set up the dapr component configuration files, which makes it difficult to manage configuration holistically.

So, to reliably retrieve, store, and manage component configuration parameters, and ultimately write the dapr configuration files, we plan to use a single Python process. This process will need to spawn an intermediate daprd process to retrieve parameters from a configured secret store. See the outline below.

  1. Read in environment variables into an internal dictionary of name/value pairs
  2. If variables read are sufficient for a secret store, read expected variables from the secret store
    1. Write dapr component configuration file with secret store variables
    2. Spawn daprd from Python process to access secret store
    3. If an expected variable is available in the secret store and not already defined in the internal dictionary, add it. (environment variables have preference)
  3. Write dapr component configuration files for components with all required variables defined
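
A condensed sketch of that flow; the variable lists and the read_secret helper are placeholders for whatever the plugins actually define:

import os

EXPECTED_VARS = ["GCP_PUBSUB_TOPIC", "AWS_SQS_QUEUE"]    # illustrative names
SECRET_STORE_VARS = ["AZURE_KEYVAULT_NAME"]              # enough to configure a store

def read_secret(name):
    """Placeholder: query the spawned daprd at /v1.0/secrets/<store>/<name>."""
    return None

def collect_config():
    # 1. Seed an internal dictionary from the environment.
    config = {v: os.environ[v] for v in EXPECTED_VARS + SECRET_STORE_VARS
              if v in os.environ}

    # 2. With a secret store available, fill in only the missing names, so
    #    environment variables keep precedence over stored secrets.
    if all(v in config for v in SECRET_STORE_VARS):
        # (write the secret store component file and spawn daprd here)
        for name in EXPECTED_VARS:
            if name not in config:
                value = read_secret(name)
                if value is not None:
                    config[name] = value

    # 3. The caller writes a component file for every fully specified service.
    return config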

Add support for other architectures

Currently the Dockerfile has amd64-specific lines:

RUN wget -q https://github.com/dapr/dapr/releases/download/v1.0.1/daprd_linux_amd64.tar.gz
RUN tar -zxvf daprd_linux_amd64.tar.gz

These lines should be replaced with a downloaded script that installs the correct dapr runtime for each architecture, roughly as sketched below.
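
A hedged sketch of an interim fix, assuming the release publishes arm and arm64 tarballs alongside the amd64 one:

RUN ARCH="$(uname -m)" && \
    case "$ARCH" in \
      x86_64)  DAPR_ARCH=amd64 ;; \
      aarch64) DAPR_ARCH=arm64 ;; \
      armv7l)  DAPR_ARCH=arm ;; \
      *) echo "unsupported architecture: $ARCH" && exit 1 ;; \
    esac && \
    wget -q "https://github.com/dapr/dapr/releases/download/v1.0.1/daprd_linux_${DAPR_ARCH}.tar.gz" && \
    tar -zxvf "daprd_linux_${DAPR_ARCH}.tar.gz"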

Upgrade/validate component versions

Components of the cloud block continue to develop. Upgrade and validate component versions before releasing the cloud block.

  • dapr -- used version: 1.0.1; latest version: 1.1.2+. Keep in mind that dapr requires Python 3.7+. We advertise use of a 3.8-tagged base image in Dockerfile.template.
  • dapr, dapr-ext-grpc Python modules -- used version: latest; latest version: 1.1.0+. Must these stay coordinated somehow with the main dapr container? Do we need to pin the Python modules?
  • eclipse-mosquitto MQTT -- used version: 1.6.9; latest version: 1.6.14+, 2.0.10+. In a brief test I was unable to use 2.x but did not investigate the cause.

Prepared yaml files in /app/components not used

A user may prepare YAML files for a cloud component, containing the configuration required by dapr, rather than specifying environment variables. We expect the user to put these files in the /app/components/ directory of a source checkout. /sensor/Dockerfile.template should then copy these files into the /app/components/ directory in the sensor image. However, presently these files are not present in the running container as expected.
