vmware-archive / dispatch
Dispatch is a framework for deploying and managing serverless style applications.
Home Page: http://dispatchframework.io
License: Apache License 2.0
dispatch get runs
returns a list of function executions. These include runs for both current and deleted functions. However, trying to get a specific run for a function which has been deleted returns a 404. Even though the function has been deleted, I would expect to get the execution. Alternatively, the runs could be deleted along with the function, but I would prefer the former.
I love intermittent issues... Anyway, oftentimes the run will not contain any output:
$ dispatch get run node-hello-subscribe-13975 dfd2d1f5-f946-4769-a55d-c9c044aff12d --json
{
  "blocking": true,
  "executedTime": 1513127053,
  "finishedTime": 1513127062,
  "functionName": "node-hello-subscribe-13975",
  "input": {
    "name": "Jon",
    "place": "Winterfell"
  },
  "logs": null,
  "name": "dfd2d1f5-f946-4769-a55d-c9c044aff12d",
  "secrets": null
}
If this function did not execute successfully, then that should be noted in the payload. I guess another issue here is that we don't know whether the function executed successfully or not, though this particular function does execute successfully on most (all) occasions.
There is some hardcoded data, such as the OpenWhisk host IP and auth key, the RabbitMQ username/password, and the OpenFaaS configuration, to name a few.
We should allow users to configure these values with the help of k8s secrets.
/kind bug
When executing a function with a schema, if the input is invalid, the function simply responds with "null". In the following example, "name" is a required field:
$ dispatch exec hello-js --input='{"name": "Jon", "place": "Winterfell"}' --wait --json | jq
{
  "myField": "Hello, Jon from Winterfell"
}
[dispatch (install)]$ dispatch exec hello-js --input='{"place": "Winterfell"}' --wait --json | jq
null
move api-test.sh to e2e/tests/
Currently, there is no single package that exposes client-side functions for easy interaction with Dispatch. The CLI uses swagger-generated code directly and has many helper functions that are useful when interacting with our API. Instead, we should:
- introduce a client package, which will act as an SDK;
- move the CLI helper functions into the client package, so that third-party integrations can be done easily.
Partially implemented for the purpose of event driver unit tests.
All entities should be updatable, and should be manageable through the CLI.
I've defined a secret slack, which had a single slack_url key with a value. Then in the code, I defined:
let completeWebhook = context.secrets["slack_url"];
I got a successful execution (no error was thrown), although completeWebhook was empty.
The proper line should be:
let completeWebhook = context.secrets["slack"].slack_url;
But this wasn't obvious. In normal, local development one could throw in a bunch of debugging statements, but this gets tricky when executing via FaaS.
What would be useful is to have an example of the input. After declaring a function, have something like:
vs exec function FUNCTION_NAME --debug
which would execute the function but also print the detailed input passed to it by FaaS.
Currently, all events are processed, but only a few of them have their extra fields included in the event payload. Using reflection or the WSDL spec, we should be able to include extra fields for all types of events.
The system admin should be able to control the set of accounts that have permission to access dispatch.
E.g., only accounts within the VMware organization have the right to use VMware's prod dispatch deployment.
Currently, the healthcheck is a no-op and the /healthz only verifies that the service itself is working. We should add checks that verify dependent services, like DB, docker daemon, message queue, FaaS implementation, etc.
We need to record function invocations and their results. The nature of the data means that it isn't suitable for the entity store; it would probably be better stored in a time series database like InfluxDB. This feature should go hand-in-hand with exposing the function invocation data (results/logs) via the API, to be consumed by the UI.
Looking at a list of functions (dispatch get functions) to look up a function can get you wondering whether there is a command to get help for a particular function, kind of like "man pages" for functions. This goes well with things like function input/output schemas.
As a function user I'd like to get help on what is the intended use of a function and some example(s) of how to use it. Possible CLI usage:
$ dispatch help my-func
As a function developer I'd like dispatch to generate the help page based on the function meta information, which I can conveniently provide in one file (it could be named f.yaml or dispatch.yaml by convention) in the function source directory. This file could have a structure similar to the Chart.yaml used for k8s Helm chart metadata. This metadata file would have the links to the function schemas.
If a README.md exists in the function source directory, its content should be appended to the generated help page.
Currently we use Artifactory (private and requires the VMware network) or Docker Hub for function images. This is less than ideal, especially if users don't want the function images to be public. Also, we create a LOT of function images (1 to 1). It would be great to be able to point to a registry deployed with or alongside dispatch (Harbor, for instance).
When creating a function, if there is an issue (for instance pointing to a non-existent image), the function goes in ERROR state. Unfortunately, the user is left to guess as to why. Each response should contain a "reason" field which contains the relevant error text. From the logs of the function manager:
time="2017-12-10T06:50:25Z" level=error msg="Error when fetching image for function node-hello-with-schema2: failed to get image: 'nodejs6': [GET /{imageName}][404] getImageByNameNotFound &{Code:404 Message:0xc42025a980}"
I would expect that "failed to get image: 'nodejs6'" would be in the payload for dispatch get function node-hello-with-schema2
As a function developer I'd like to have all the function related files (source code, dependencies, schemas, README, metadata, etc) placed in the function source directory.
A possible CLI command to deploy a function would look like:
$ vs create function ./my-func
This bug is specific to the function-manager, but likely in other services as well.
The process is started with the following flags:
"args": [
  "--config=/data/config/function-manager.json",
  "--organization=$(ORGANIZATION)",
  "--host=0.0.0.0",
  "--port=80",
  "--db-file=/data/function-manager/db.bolt",
  "--image-manager=dispatch-image-manager",
  "--secret-store=dispatch-secret-store",
  "--faas=openfaas",
  "--tls-port=443",
  "--tls-certificate=/data/tls/tls.crt",
  "--tls-key=/data/tls/tls.key",
  "--debug"
],
Yet the logs still look like:
time="2017-12-13T00:40:32Z" level=debug msg="[BEGIN] [github.com/vmware/dispatch/pkg/function-manager.(*Handlers).getFunctions:298] StoreGetFunctionsHandler"
time="2017-12-13T00:40:33Z" level=debug msg="[BEGIN] [github.com/vmware/dispatch/pkg/function-manager.functionListToModel:69] functionListToModel"
time="2017-12-13T00:40:33Z" level=debug msg="[BEGIN] [github.com/vmware/dispatch/pkg/function-manager.functionEntityToModel:47] functionEntityToModel"
time="2017-12-13T00:40:33Z" level=debug msg="[END ] [github.com/vmware/dispatch/pkg/function-manager.functionEntityToModel:47] [1.331787ms] functionEntityToModel"
time="2017-12-13T00:40:33Z" level=debug msg="[BEGIN] [github.com/vmware/dispatch/pkg/function-manager.functionEntityToModel:47] functionEntityToModel"
time="2017-12-13T00:40:33Z" level=debug msg="[END ] [github.com/vmware/dispatch/pkg/function-manager.functionEntityToModel:47] [427.346µs] functionEntityToModel"
time="2017-12-13T00:40:33Z" level=debug msg="[END ] [github.com/vmware/dispatch/pkg/function-manager.functionListToModel:69] [387.462921ms] functionListToModel"
time="2017-12-13T00:40:33Z" level=debug msg="[END ] [github.com/vmware/dispatch/pkg/function-manager.(*Handlers).getFunctions:298] [898.250277ms] StoreGetFunctionsHandler"
Bolt DB only really works in a single-host installation. In a multi-node Kubernetes environment, if pods are rescheduled upon restart, the database is lost. This leads to very inconsistent state.
Either use etcd or consul.
Test multiple exchanges/fanout approach vs single exchange/topics approach in terms of speed and resource consumption.
It is currently not being used.
It should be refactored and used for unit tests (mainly local ones) which have outer dependencies (like Kong and OpenFaaS).
I'd prefer to import unit-test configurable values, like secret/IP/hostname, from a conf file rather than hardcoding them in the *_test.go files.
If for some reason the API returns a text content type (e.g. because of some internal error or an Ingress error), the CLI craps out with an unfriendly error message. The CLI should at least be able to print the body of the text error, or catch it and return a more usable message.
Got an error creating a function with uppercase letters (i.e. getVMs). This is the error in the logs. We should accept upper- and lowercase function names (or at least provide a useful error to the user):
time="2017-11-16T01:47:44Z" level=error msg="Driver error when creating a FaaS function: Error response from daemon: Error parsing reference: \"serverless-docker-local.artifactory.eng.vmware.com/func-getVMs:20171116-014743\" is not a valid repository/tag: repository name must be lowercase
failed to build an image
gitlab.eng.vmware.com/serverless/serverless/pkg/functions/openfaas.(*ofDriver).buildAndPushImage
/serverless/src/gitlab.eng.vmware.com/serverless/serverless/pkg/functions/openfaas/builder.go:100
gitlab.eng.vmware.com/serverless/serverless/pkg/functions/openfaas.(*ofDriver).processRequests
/serverless/src/gitlab.eng.vmware.com/serverless/serverless/pkg/functions/openfaas/builder.go:43
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:2337
Error building image for function 'getVMs'
gitlab.eng.vmware.com/serverless/serverless/pkg/functions/openfaas.(*ofDriver).Create
/serverless/src/gitlab.eng.vmware.com/serverless/serverless/pkg/functions/openfaas/driver.go:92
gitlab.eng.vmware.com/serverless/serverless/pkg/function-manager.(*Handlers).addFunction
/serverless/src/gitlab.eng.vmware.com/serverless/serverless/pkg/function-manager/handlers.go:231
gitlab.eng.vmware.com/serverless/serverless/pkg/function-manager.(*Handlers).(gitlab.eng.vmware.com/serverless/serverless/pkg/function-manager.addFunction)-fm
/serverless/src/gitlab.eng.vmware.com/serverless/serverless/pkg/function-manager/handlers.go:192
gitlab.eng.vmware.com/serverless/serverless/pkg/function-manager/gen/restapi/operations/store.AddFunctionHandlerFunc.Handle
/serverless/src/gitlab.eng.vmware.com/serverless/serverless/pkg/function-manager/gen/restapi/operations/store/add_function.go:23
gitlab.eng.vmware.com/serverless/serverless/pkg/function-manager/gen/restapi/operations/store.(*AddFunction).ServeHTTP
/serverless/src/gitlab.eng.vmware.com/serverless/serverless/pkg/function-manager/gen/restapi/operations/store/add_function.go:71
gitlab.eng.vmware.com/serverless/serverless/vendor/github.com/go-openapi/runtime/middleware.NewOperationExecutor.func1
/serverless/src/gitlab.eng.vmware.com/serverless/serverless/vendor/github.com/go-openapi/runtime/middleware/operation.go:28
net/http.HandlerFunc.ServeHTTP
/usr/local/go/src/net/http/server.go:1918
gitlab.eng.vmware.com/serverless/serverless/vendor/github.com/go-openapi/runtime/middleware.NewRouter.func1
/serverless/src/gitlab.eng.vmware.com/serverless/serverless/vendor/github.com/go-openapi/runtime/middleware/router.go:73
net/http.HandlerFunc.ServeHTTP
/usr/local/go/src/net/http/server.go:1918
gitlab.eng.vmware.com/serverless/serverless/vendor/github.com/go-openapi/runtime/middleware.Redoc.func1
/serverless/src/gitlab.eng.vmware.com/serverless/serverless/vendor/github.com/go-openapi/runtime/middleware/redoc.go:72
net/http.HandlerFunc.ServeHTTP
/usr/local/go/src/net/http/server.go:1918
gitlab.eng.vmware.com/serverless/serverless/vendor/github.com/go-openapi/runtime/middleware.Spec.func1
/serverless/src/gitlab.eng.vmware.com/serverless/serverless/vendor/github.com/go-openapi/runtime/middleware/spec.go:45
net/http.HandlerFunc.ServeHTTP
/usr/local/go/src/net/http/server.go:1918
net/http.serverHandler.ServeHTTP
/usr/local/go/src/net/http/server.go:2619
net/http.(*conn).serve
/usr/local/go/src/net/http/server.go:1801
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:2337"
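Since Docker requires repository names to be lowercase, one fix is to normalize the user-supplied function name when deriving the image repository. A sketch under that assumption; the `func-` prefix mirrors the log above, but the exact normalization rules would need to match the registry spec:

```go
// Sketch: derive a valid Docker repository name from a user-supplied
// function name by folding case and replacing disallowed characters.
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Docker repository path components allow lowercase letters, digits,
// and a few separators; everything else is replaced with "-".
var invalidChars = regexp.MustCompile(`[^a-z0-9._-]`)

// imageRepo normalizes a function name into a lowercase repository name.
func imageRepo(funcName string) string {
	return "func-" + invalidChars.ReplaceAllString(strings.ToLower(funcName), "-")
}

func main() {
	fmt.Println(imageRepo("getVMs")) // func-getvms
}
```

The user-visible function name stays mixed-case; only the derived image reference is normalized, so "getVMs" and a hypothetical "getvms" would still need a uniqueness check at creation time.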
/healthz should be used in liveness/readiness probes. Since it's going to be called often, it should not produce too much log noise. It should also be able to verify connectivity to satellite services.
A name might not be significant for a subscription, thus it's not required. Currently, a combination of the topic and function name is used, but this may cause conflicts and weird names. Ideas on how to solve it:
The intent of the image manager is to manage the system and language dependent packages on a runtime image.
The system libraries are installed on the Photon OS-based base image. Generally these will be installed via the tdnf tool; however, it should also be possible to extend this to support other means of installation (curl?). The developer should be able to specify a manifest file (which we must define) along with a base image to produce an image which functions may be run upon. The original intent was to curate and manage a catalog, but this may be difficult and burdensome. Instead, administrators should be able to manage a blacklist which would prevent specific packages from being installed. Additionally, the image manager must report back all of the packages installed on the image, and this information should be stored with the image. This gives administrators a view of exactly what is installed on each image.
The language dependent libraries should operate similarly. The only real difference is that the manifest is already defined (package.json, requirements.txt, etc) by the language ecosystem. The image manager should then parse this manifest and ensure that all packages are approved (i.e. not on the blacklist). Just like above, a list of all installed packages must be reported back and stored with the function. The method for gathering this list, will be language dependent (i.e. pip freeze).
These features should provide adequate flexibility, security and visibility to the software running with the functions.
There are some *_mock files in event-manager; I suggest moving them into a /mock folder, as is done by the other managers.
If a function is in a non-READY state, it should either execute after the function transitions to READY or return a helpful error. Currently, the execution pauses (seemingly waiting until the function is READY), but returns null. Additionally, the message "Function i-have-a-secret finished successfully." is not correct.
I would expect that if the function is in CREATING (or any non-ERROR state), the exec would just wait and then return successfully. If it's in the ERROR state, the API should return a helpful error:
$ dispatch create function nodejs6 i-have-a-secret examples/nodejs6/i-have-a-secret.js
Created function: i-have-a-secret
[dispatch (install)]$ dispatch exec i-have-a-secret --wait
Function i-have-a-secret finished successfully.
null
[dispatch (install)]$ dispatch exec i-have-a-secret --wait
Function i-have-a-secret finished successfully.
{
"message": "I know nothing"
}
This goes for both sync (--wait) and async. Furthermore, exec --json should work both with and without --wait.
The function manager now has a dependency on the image manager for resolving the image name -> Docker URL. The function manager should still be runnable without the image manager for development and testing. One suggestion would be to accept a JSON file on the command line which includes a static/mocked image manager payload.
This has been implemented for unit-tests, but not for running stand-alone.
If there are numbers in the function name, it leads to a DNS error while creating the k8s service.
dispatch create function base-python3 py-hello-no-schema ../examples/python3/hello.py
Created function: py-hello-no-schema
dispatch create function base-python3 py3-hello-no-schema ../examples/python3/hello.py
[Code: 500] Error: internal server error when creating a Faas function
As we can see from the function manager log:
Driver error when creating a FaaS function: Server returned unexpected status: 400, (py3-hello-no-schema) must be a valid DNS entry for service name
PS: it may be my k8s cluster issue, as it passes the e2e test.
Easy integration with DynamoDB
Polish CLI by adding missing examples to commands
Based on e2e test failures, it appears that although functions are in a READY state, they are not actually ready to run yet.
{
  "blocking": true,
  "executedTime": 1513632380,
  "finishedTime": -62135596800,
  "functionId": "35aa1618-59d2-4b78-9bff-c4ca52276b13",
  "functionName": "node-hello-no-schema",
  "input": {
    "name": "Jon",
    "place": "Winterfell"
  },
  "logs": null,
  "name": "ccd1fac2-2b80-4732-987d-0ec5727756e7",
  "reason": [
    "error running function: node-hello-no-schema: Server returned unexpected status code: 500 - Can't reach service: of-35aa1618-59d2-4b78-9bff-c4ca52276b13"
  ],
  "secrets": null,
  "status": "ERROR"
}
This is likely due to the way that OpenFaaS creates functions. The k8s service associated with the function is not actually up and ready by the time the function execution is called. We mark a function READY as soon as it's submitted to OpenFaaS.
The use case is a developer who wants to work on the UI and run it against a remotely deployed dispatch cluster, though I think the same issues exist even if deployed on a local minikube. Currently the UI only works if actually deployed in the Kubernetes cluster. This is due to two things: CORS and auth.
The CORS issue can be addressed by simply adding the following annotation on the ingress records for the services: ingress.kubernetes.io/enable-cors: "true"
However, while the request will now pass the CORS pre-flight check, all requests now return a 401 (unauthenticated). There is currently no workaround for this (besides running the UI in the cluster behind the ingress controller and oauth2proxy).
The expectation is for a developer to be able to simply run ng serve locally and work against a dispatch API.
The demo.sh script does a good job of running our e2e tests, but it should probably be migrated to a framework. BATS would be a relatively easy migration as it's bash and it's simple. This way we can build in retries and other common functions as well as get good report output.
The image manager already uses a version of the control loop, but it should use shared libraries to follow the patterns of function-manager and api-manager.
When secrets are defined, they are given a name. That name then becomes the key in the "secrets" payload injected into the function. This seemed like a good implementation at first, but it prevents you from naming keys "dev" and "prod" and using them interchangeably, for instance.
We need to decouple the name from the key/values. This will of course require a way to resolve conflicts when the same key is defined in multiple secrets.
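A decoupling sketch, under the assumption that secrets remain named bags of key/value pairs: flatten all of a function's secrets into one map so the secret name no longer appears in the lookup path, and fail loudly on conflicting keys rather than silently overwriting.

```go
// Sketch: merge named secrets into one flat key/value map for injection,
// detecting conflicts instead of overwriting.
package main

import "fmt"

// mergeSecrets drops the secret name from the lookup path. A key defined
// in two different secrets is reported as a conflict.
func mergeSecrets(secrets map[string]map[string]string) (map[string]string, error) {
	out := map[string]string{}
	owner := map[string]string{} // key -> secret that defined it
	for name, kv := range secrets {
		for k, v := range kv {
			if prev, dup := owner[k]; dup {
				return nil, fmt.Errorf("key %q defined in both %q and %q", k, prev, name)
			}
			owner[k] = name
			out[k] = v
		}
	}
	return out, nil
}

func main() {
	m, err := mergeSecrets(map[string]map[string]string{
		"slack": {"slack_url": "https://hooks.example.com/T000"}, // hypothetical values
		"db":    {"password": "s3cret"},
	})
	fmt.Println(m["slack_url"], err)
}
```

With this, context.secrets["slack_url"] from the earlier issue would work, and swapping a "dev" secret for a "prod" one needs no code change as long as the keys match.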
Currently, login displays a GitHub auth page which says "Authorize seanhuxy" (see attached screenshot).
We know who Sean is, but other people might think it's some sort of a bug. We should have something like "Authorize dispatch".
Quote from Sean:
As discussed with Berndt, we actually want the system admin to be able to decide which IDP, and which account on that IDP, they are using, so I am going to work on this issue: making the IDP configurable.