
CNCF Serverless WG

The original intent of the CNCF Serverless Working Group was to explore the intersection of cloud native and serverless technology. As a result of that work the following artifacts were produced:

Since then the working group has expanded its mission to include a set of sub-projects:

While the serverless working group acts as a focal point for serverless-related CNCF activities, each sub-project is independent and defines its own processes, governance model, and work. Periodically, each sub-project should join the serverless working group's weekly call to provide an informational update on any key activities that might be of interest to the broader serverless community.

Additional work streams can be suggested, see the proposal directory's README for more information.

The TOC sponsor of this WG is Ken Owens.

Non-Goals

Identify one serverless project to rule them all.

Communications

The mailing list for e-mail communications:

And a #serverless Slack channel: https://slack.cncf.io/

Landscape

Serverless Landscape

You can open up suggestions and issues with the landscape here: https://github.com/cncf/landscape.

Interactive Landscape

Please see the new interactive version of the landscape. The easy-to-remember URL is s.cncf.io.

Serverless Overview Whitepaper

The current version of the whitepaper can be found here.

Docs

Presentations, notes, and other miscellaneous shared docs

Meeting Time

See the CNCF public events calendar.

The Serverless Working Group meets every Thursday at 9AM PT (USA Pacific). See the Serverless Working Group Meeting Minutes for dial-in information.

In Person Meetings

None planned at this time.

Meeting Minutes

The minutes from our calls are available here.

Recordings from our calls are available here, and older ones are here.

Some of the presentations made during the calls can be found here.


wg-serverless's Issues

restructure the event state and operation state

Here is my thought to address the confusion on the event state:
We can do the following to make it clearer and more consistent with the definitions of the other states, especially the Operation state:

  1. Rename the "events" field to "eventAction", since the definition of "events" includes not only event triggers but also the actions, the action mode associated with the event triggers, and the transition to the next state. Suggestions for a better name are welcome.
  2. Add "operationAction" to the operation state and move "actionMode" and "actions" inside "operationAction". This will make the first-level fields of all the states more consistent.
  3. Modify the "condition" definition inside the existing "events" as follows:
    "Condition consisting of Boolean operation of events that will trigger the actions"
    I think the existing description that says "Condition consisting of Boolean operation of events that will trigger the event state" could cause confusion. If the event state is not the start state, the previous state's transition triggers the transition to the event state, not the events.

wdyt?

[workflow] Defining multiple events in eventsActions expression and state data integrity

This is related to pr #153
This issue tackles one particular problem with current specification:

Defining multiple events in eventsActions expression and state data integrity

Currently our specification defines that, in an Event state, the payload of events that are expressed in the event expression and trigger actions is merged with the state's data. To visualize:

[Screenshot (2020-02-05): event payload flowing through the optional event data filter into the state data]

where the resulting data after the optional event data filter is merged with the state data.
When using multiple events, let's say A, B, C, D, E, merging each event's payload into the state data can create many issues and ultimately compromise workflow orchestration decisions. To still do this, we would have to define possible non-portable merging strategies, for example via extensions.
Also, the "eventsActions" definition currently defines a single event data filter, which works for a single event but not for multiple events.

  1. Proposed solution: pr #153 : As mentioned in issue #156:
    Unions and intersections of events should be done with control flow logic rather than string expressions. This allows for a portable implementation of those as well as their visual representation.
    In addition, to deal with this particular problem, we need to allow only a single orchestration event per eventsActions definition, for example:
"events": [
 {
  "name": "Event1",
  "type": "event1Type",
  "source": "event1Source"
 },
{
  "name": "Event2",
  "type": "event2Type",
  "source": "event2Source"
 }
],
"states":[  
  {  
     "name":"Event State",
     "type":"EVENT",
     "eventsActions": [{
         "eventRef": {
            "name": "Event1"
         },
         "actions":[  
            {  
               ...
            }
         ]
     },
{
         "eventRef": {
            "name": "Event2"
         },
         "actions":[  
            {  
               ...
            }
         ]
     }
]
...
  }
]

This way only a single event payload needs to be merged into the state data, using the event data filter defined for it.
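A rough sketch of what that single-event merge could look like in an implementation. This is purely illustrative: the dot-path filter below is a major simplification of the spec's event data filters, and the function names are made up.

```python
def apply_event_data_filter(payload, path=None):
    """Select the part of the event payload named by a dot path (simplified filter)."""
    if not path:
        return payload
    node = payload
    for key in path.split("."):
        node = node[key]
    return node

def merge_event_into_state(state_data, event_payload, data_filter=None):
    """Merge the (filtered) payload of ONE event into the state data.

    Restricting each eventsActions entry to a single event, as the
    proposal suggests, makes this merge unambiguous: keys from the
    event simply overwrite state keys of the same name.
    """
    filtered = apply_event_data_filter(event_payload, data_filter)
    merged = dict(state_data)
    merged.update(filtered)
    return merged
```

With multiple events per entry, by contrast, the overwrite order between the events' payloads would have to be pinned down by some extension, which is exactly the portability problem described above.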

Transition precedence with the retry definition

Transitions are a great way to specify that the workflow should switch to another state. But I think there's a problem with the retry definition allowing to define transitions.

Currently, transitions can be specified

  • in the onError behaviour of workflow runtime errors and of the delay, event, operation, parallel, switch, subflow, and foreach states
  • in eventsActions, evaluated at the end of a sequential or parallel set of actions as part of the event state
  • in states, to define where to transition next (delay, operation, parallel, subflow, relay, foreach, and switch states); in switch not only as the default but also for single-, and-, or-, and not-choices

and also:

  • in retry behaviour of a single action

I'm especially concerned with the latter in concurrent actions. Any parallel state, or an event state that uses parallel mode, may IIUC contain such an action retry definition that, when the maximum number of retries has been exceeded, transitions out of the branch into a state that e.g. halts the workflow. When the retry policy fires, i.e. says that it is time to transition, the parallel state can no longer wait for completion of all the parallel actions. It is undefined what happens to the ongoing concurrent executions.

Unlike the retry (that is attached to an action) there is no problem with onError, because error definitions are only used for the outcome of the entire state or for overall workflow runtime errors. But retry is tied to an action, which can happen anywhere during a more complex state execution.

I see several ways to resolve this:

  1. specify that an overrun of attempts is an error, i.e. halt or conclude executions within the state to ensure completion and then trigger the state's onError behaviour
  2. specify that in case of exceeding maxAttempts, the surrounding state would still conclude (join parallel executions) but that the retry transition overwrites any other evaluation, i.e. a preceding retry transition overrules any other outcome of the state.
  3. let the parallel branch enter that state and be generally aware of branching in the workflow

For clean workflow state handling, I'd prefer 1., because exceeding retries is, to me, an error, and circumventing the state's onError definition with an in-place retry transition can be a bit confusing.
I'm also tempted towards 3., i.e. to allow branching/concurrent paths in the workflow. As of now, concurrency can only happen within a state. Even multiple arriving events that correlate to the same workflow instance cannot preempt/interrupt the flow but would be buffered and hence may actually never be consumed. At least I think the current position in this group is not to allow a workflow to take concurrent paths. Branching out and aggregating branches at a later point would be a bit more powerful, but can also make things a lot more complex.

On a related note, I'd like to point out the choice state: by using a fitting expressionLanguage, transitions allow similar matching of event data and can achieve the same boolean logic and value comparisons that are currently defined as choices, which arguably makes the choice state superfluous.

[workflow] Define event-expression format

We need to define how the event-expression must be expressed. Trying to keep it language-agnostic.

Proposal:

Comparison

  • eq: equals
  • ne: not equals
  • empty: If the value is empty or doesn't exist

Logical

  • and: And operator
  • or: Or operator
  • not: Not operator

Examples

name eq 'mytrigger'
type eq 'sometype' or type ne 'uglytype'
not empty(correlation_token)
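To make the proposed semantics concrete, here is a hypothetical evaluator for these operators over an event's attributes. String parsing is deliberately left out, so expressions are written as nested tuples; the tuple encoding and function name are illustration only, not part of the proposal.

```python
def eval_expr(expr, event):
    """Evaluate a proposed event-expression against an event's attribute dict.

    Expressions are nested tuples, e.g. ("eq", "name", "mytrigger") or
    ("or", ("eq", "type", "sometype"), ("ne", "type", "uglytype")).
    """
    op = expr[0]
    if op == "eq":                       # comparison: equals
        return event.get(expr[1]) == expr[2]
    if op == "ne":                       # comparison: not equals
        return event.get(expr[1]) != expr[2]
    if op == "empty":                    # true if value is empty or missing
        return not event.get(expr[1])
    if op == "and":                      # logical and over sub-expressions
        return all(eval_expr(e, event) for e in expr[1:])
    if op == "or":                       # logical or over sub-expressions
        return any(eval_expr(e, event) for e in expr[1:])
    if op == "not":                      # logical negation
        return not eval_expr(expr[1], event)
    raise ValueError(f"unknown operator: {op}")
```

For example, the third proposal example, `not empty(correlation_token)`, would be the tuple `("not", ("empty", "correlation_token"))`.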

[workflow] separating consumed/produced event types

Status: A workflow description has an events property, a set of CloudEvent definitions that can be "consumed or produced". Each event definition has a mandatory source and type in accordance with CloudEvents.

Issue: It is difficult to identify which events are produced and which are consumed. An engine that acts as the producer would need to ensure that all created events' source+ID pairs are unique, e.g. by putting itself as the source. An engine may want to register as a consumer of consumed event types only.
Should the spec separate produced/consumed event types?

Also, for the types produced, what would be used as the source?
The developer can statically set an absolute URI and would somehow need to ensure all produced events' IDs are unique (UUIDs are typically reliable). Or would the workflow engine want to put its own absolute URI to identify itself as the source of an event?
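One of the options above, the engine stamping itself as the source and generating UUID ids, might look roughly like this. The engine URI and function name are hypothetical, and the CloudEvents attribute set is trimmed to the essentials:

```python
import uuid

# Hypothetical identity of the workflow engine acting as producer.
ENGINE_URI = "https://workflow-engine.example.com/instances/42"

def make_produced_event(event_type, data, source=ENGINE_URI):
    """Build a minimal CloudEvents-style dict for a produced event.

    The engine puts its own absolute URI as "source" and a fresh UUID as
    "id", so every source+id pair it emits is unique.
    """
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),   # unique per event
        "source": source,          # engine identifies itself as the producer
        "type": event_type,
        "data": data,
    }
```

This sidesteps the question of developer-chosen static sources entirely, at the cost of consumers seeing the engine, not the logical workflow, as the source.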

Restrictions on defining multiple event triggers

This is related to pr #153
This issue tackles one particular problem with current specification:

Restrictions on defining multiple event triggers

  1. Let's first define "event" (as in an event-based architecture).
    An event is a fact that presents a few distinguishing characteristics:
  • It is usually immutable.
  • It has temporal constraints: usually you need to correlate multiple events, specifically temporal
    correlations where events are said to happen at some point in time relative to other events.
  • It has a managed lifecycle: due to their immutable nature and the temporal constraints, events usually will only match other events during a limited window of time.

If you look at issue #155 there we show some examples of event expressions inside event states.

As per comments on pr #153, we should not allow temporal constraints on events and should only allow boolean expressions that are not concerned with them. This means the workflow specification will not provide any means to define event relationships with regard to time (which is at the core of their definition).

To put it into perspective, the workflow specification could model a use case like this (as mentioned in the comments of the pr):


The patient is being monitored for multiple physical information. Each monitoring device produces an event and there are multiple monitoring devices. If any one of the events exceeds some threshold, an alert should be raised.

but could not model more realistic real-world use cases such as:


The patient is being monitored for multiple physical information. If a patient is having a Heart Attack, immediately call the doctors. If the patient is having a Mild Fever wait 5 minutes and if the temperature is still high call the nurse

Where the Heart Attack and Mild Fever events might be deduced from several monitoring events using the temporal constraints of the events produced.

Event relations such as "after", "before", "coincides", "during", "finishes", "finished by", "includes", "overlaps", etc., i.e. the relationships that actually allow you to solve real-life business problems in event-based architectures, are simply declared "not allowed".

This is a big limitation on what serverless workflows can or cannot do. One can argue that the inability to orchestrate real-life problems makes this specification useless.

  1. Proposed solution: pr #153 :
  • it is not in the scope of the serverless workflow spec to be a complex event processing service/engine, but the capability to make decisions based on event temporal constraints is still needed to orchestrate real-life business problems.
  • instead of dealing with low-level events, which by themselves have no meaning or only allow simple use cases (such as the first one above), the serverless workflow spec defines the consumption of high-level or "orchestration events" such as "Heart Attack" or "Mild Fever".
    CEP services or queue-based systems provided by cloud providers would be responsible for defining the complex temporal event rules and producing the corresponding orchestration events.
  • Unions and intersections of these orchestration events should be done with control-flow logic rather than string expressions. This allows for a portable implementation as well as a visual representation.
    For example, an "or" relationship using a parallel state:

[Screenshot (2020-02-05): "or" relationship modeled with a parallel state]

or "and" with simple transitions:

[Screenshot (2020-02-05): "and" relationship modeled with simple transitions]

This is implemented and documented with examples in #153.

Clarification on supporting multiple actions and actionMode in Operation State

I have this issue opened and would like to discuss the spec regarding the clarification on supporting multiple actions and actionMode in Operation state.

If my understanding is correct, the Operation State can contain multiple actions, and each action contains one function or one subflow. However, this approach cannot flexibly specify an onErrors definition, such as a retry policy, for each action.

Furthermore, I looked at a closed issue with a similar topic, #140. @tsurdilo @manuelstein @cathyhongzhang Thanks for the input on that issue; it helped me clarify some things. :)

However, I am still trying to clarify the use cases for multiple actions in one Operation State vs. simply using multiple Operation States, which gain the flexibility of specifying an onError definition for each function. AWS Step Functions has a similar concept called "Task" (I think it is roughly equivalent to Operation), which can contain only one function, and the user can specify a retry definition for each function.

Therefore, it may be confusing to new users who are not fully aware of the pros and cons of using multiple actions, which loses the flexibility of specifying a retry policy per action in an Operation State.

It could be just me having this question lol.. I would appreciate if you could please help me out. :)

Workstream Proposal: Common Serverless Benchmark framework

Users often try to compare serverless frameworks on performance: which is faster, and in which use case. Rather than having each vendor define their own benchmark, which may be biased towards their own implementation, it would be great to have a common standard like SPECvirt or YCSB (the NoSQL benchmark).

Performance benchmarks may include aspects of throughput, latency, scalability, cost/performance, cold/warm start, etc. There may be various use cases with different performance behaviors such as small HTTP requests, stream processing, image processing (each may have different bottleneck between network, data, CPU).

The Nuclio team made a small step in that direction with a simple request latency & throughput benchmark that can be used to benchmark various serverless platforms; see the link.
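For illustration only, a latency-measuring harness of the kind such a benchmark might start from could look like this. The callable stands in for a real function invocation over HTTP; a real cross-platform benchmark would additionally cover cold starts, throughput under concurrency, and cost.

```python
import statistics
import time

def bench_latency(invoke, iterations=1000):
    """Measure per-invocation latency of `invoke` and report simple percentiles."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        invoke()                              # stand-in for one function call
        samples.append(time.perf_counter() - start)
    samples.sort()
    return {
        "p50": samples[len(samples) // 2],
        "p99": samples[int(len(samples) * 0.99) - 1],
        "mean": statistics.mean(samples),
    }
```

The hard part a common standard would have to settle is not this loop but the workload definitions (payload sizes, concurrency levels, warm vs. cold invocations) so that numbers are comparable across vendors.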


Please discuss, then vote with a 👍 if you would like this to be the next workstream item.

Workstream Proposal: Event Orchestration / Chaining

The way events are orchestrated and functions are chained was deemed out of scope for CloudEvents, but may be in scope for the Serverless working group.

There are issues such as event history/chaining and event nesting that need this to be defined. There is an increasing number of questions about who is allowed to modify a CloudEvent, how an event is forwarded, and so on.


Please discuss, then vote with a 👍 if you would like this to be the next workstream item.

Workstream Proposal: Common function model

Each platform today has its own function spec file/API describing the desired function resources, environment variables, triggers, etc. (e.g. the nuclio function spec doc). This means deploying a function on a new platform requires adapting your deployment scripts or logic every time you shift providers.

It is even a greater burden in many cases since the function configuration may depend on external resources such as databases, API gateways, message queues, etc.

Some efforts like AWS SAM or Serverless.com have tried to deliver a higher-level abstraction with the potential of delivering a common/cross-platform model, yet such efforts may require participation from the platform providers and agreement on such a common model.


Please discuss, then vote with a 👍 if you would like this to be the next workstream item.

[workflow] Proposal: Confusing Terms and Scope of Workflow WG

After checking with different members of the Serverless WG in San Diego, Kubecon and online, it is clear that there are two confusing aspects around this Specification:

  1. Scope of the group, currently focused on a language
  2. Terms and Nomenclature used in the language

Scope of the group

Looking at other initiatives, we need to make sure that the scope of the Workflow sub group is as scoped as possible to make sure that we can incrementally deliver value to our intended audience.

The current focus until now has been the definition of a new workflow language to avoid provider lock-in. While this is an important effort, we also need to cover how these workflow definitions will be referenced in the infrastructure; following how CloudEvents has defined its types, this group will benefit from having well-defined types for workflows that can be understood by other components. I will be working on a PR with a proposal around this, as well as a more scoped definition of the subgroup's goals and objectives.

Terms and Nomenclature for the Workflow language

Currently, this Specification uses State Machine nomenclature to specify how a workflow can be constructed, where the main concept used is "states". While this is fine for certain scenarios, using States nomenclature can lead to wrong assumptions tied to more theoretical frameworks such as automata and FSMs (https://en.wikipedia.org/wiki/Finite-state_machine), where certain formalisms need to be applied that will not hold for all use cases of this specification.
I will be working on a proposal for the group to decide on these new terms to make sure that the specification is as clear as possible and doesn't lead to wrong or complex assumptions about the intentions for the language.

@tsurdilo @cathyhongzhang @mswiderski @berndruecker I will create two PRs, but I wanted to share this first to see whether the group understands the motivation behind these proposals, and to clarify doubts for everyone else who wants to get involved.

[workflow] Event state simplification

Event states can just be the entry point for a workflow. There is no need to define any actions; just define the next-state, which might be an OPERATION or something else.

e.g.

    {
      "name": "test-state",
      "events": [
        {
          "event-expression": "name eq event1",
          "timeout": "10 min",
          "next-state": "nextStateForEvent1"
        },
        {
          "event-expression": "name eq event2",
          "timeout": "10 min",
          "next-state": "nextStateForEvent2"
        }
      ],
      "type": "EVENT",
      "start": true
    },
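Dispatching on such a simplified event state could then be sketched as below. This is my illustration, not part of the proposal; only the `name eq <value>` expression form from the snippet above is handled, and the function name is invented.

```python
def next_state_for(event_state, event_name):
    """Return the next-state whose event-expression matches the incoming event name.

    Handles only expressions of the form "name eq <value>", as used in
    the simplified event-state example; returns None when nothing matches.
    """
    for entry in event_state["events"]:
        field, op, value = entry["event-expression"].split()
        if field == "name" and op == "eq" and value == event_name:
            return entry["next-state"]
    return None
```

The point of the simplification is visible here: the event state carries no actions of its own, so the runtime's only job at this state is routing.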

[workflow] Spec contents that are difficult to understand

I'm having multiple issues with the spec, but maybe that's just me, so this is an attempt to get it sorted out, and I'd be happy if you could help me.

Why can Delay state types loop?
Why does Delay state type have error handling when there's no action attached (e.g. Relay state type can't)?
How can the Switch state type loop when its choices might have transitioned the workflow to another state already?

If my understanding of the correlationToken is correct, then a workflow can receive multiple events through separate sources with the same token, meaning that they all belong to the same workflow activation. IIUC, the first time a token with content "session1" appears, the workflow would start at the startsAt state and subsequent events with the same token "session1" can be consumed by an Event-type state if there is any.

  • What if the workflow consuming "session1" tokens has not yet reached the Event-type state that the event is targeted at? (would they need queuing?)
  • Why does an Event state contain more than one event definition? This could be a Switch state evaluation that has multiple choices for the received event
  • Why is there a list of actions plus an actionMode that defines if the actions are run in parallel when there is also the Parallel state type?
  • What does it mean if an Event-type state loops - would it transition to another state as defined in its EventDef or consume more events?
  • What if I have parallel branches waiting for external events? Do they consume events or would non-matching events be put back in the queue?

I also noticed that the Event state type condition is a string, the Switch state type choices are JSON-encoded logic, but transitions and error handling use a separate expression language.

I'd suggest to

  • reduce the Event state to just wait for an external event and transitions
  • unify the conditions
  • consider replacing Switch with just a set of conditional transitions
  • replace error handling with just evaluating whatever the action output (i.e. not having the engine detect if the result of an action is a stack trace due to aborted execution or a valid result)
  • remove loop from attributes and make it its own construct, maybe merge it with subflow

The step-by-step execution is nice, but maybe the control logic is really independent from triggering functions, e.g. something that allows a more flexible use of control elements like sleeps, loops, conditions, and outside interaction (event send/receive).

Formalize the charter for the Serverless WG

We have a "charter-esque" like listing here but I think we should formalize not just out goals but our operations, how we vote, etc.

I am happy to take the charge. Creating this issue for tracking purposes.

Workstream Proposal: Common function logging, observing, and monitoring

Functions generate logs which are stored in the underlying platform (e.g. Kubernetes logs, AWS CloudWatch, Azure App Insights, Elasticsearch...). Each serverless platform has its own way of writing to a log. If we had a common way/API for logging, it could make functions portable AND allow simple integration between log services and function platform providers.

See proposal details in function logging

In addition to logs, standardizing and integrating other APIs for custom metrics, counters, and tracing could simplify developer work.


Please discuss, then vote with a 👍 if you would like this to be the next workstream item.

Is cron like scheduling possible?

I've been reviewing the spec, and it seems you can define the state/task start kind to be scheduled, but this only takes an interval as a property. Is it possible to specify a cron-like syntax?
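To illustrate what cron-like scheduling would add over a plain interval, here is a minimal, hypothetical matcher. It supports only `*`, `*/n`, and exact numbers per field, far less than full cron syntax, and the day-of-week convention shown is an assumption:

```python
from datetime import datetime

def _field_matches(spec, value):
    """Match one cron field: '*', '*/n', or an exact number."""
    if spec == "*":
        return True
    if spec.startswith("*/"):
        return value % int(spec[2:]) == 0
    return int(spec) == value

def cron_matches(expr, dt):
    """expr: 'minute hour day-of-month month day-of-week'."""
    minute, hour, dom, month, dow = expr.split()
    return (_field_matches(minute, dt.minute)
            and _field_matches(hour, dt.hour)
            and _field_matches(dom, dt.day)
            and _field_matches(month, dt.month)
            and _field_matches(dow, (dt.weekday() + 1) % 7))  # cron-style: Sunday=0
```

An interval can only say "every N units", whereas `cron_matches("0 9 * * 1", ...)` style expressions can pin a schedule to wall-clock anchors like "9:00 every Monday", which is what the question is asking for.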


[workflow] Terminology, Scope and Specification Core simplification

As stated here: #127 one of the main challenges for people looking at the Workflow Spec inside the CNCF Serverless WG is around the scope of the Workflow Sub Working Group and terminology used in the Spec.

This issue proposes a set of changes to the terminology and the scope of the Specification to help clarify the scope and purpose of the language itself.

Terminology change proposal

Moving away from State Machine/Automata terminology (Finite-state machine - Wikipedia) will bring clarity and set the right expectation for the users of the language:

  • State Machines terminology creates certain expectations around formality that this specification is not trying to cover. This is causing confusion, as the terminology is mixed all over the place where the main construct is called “State”.
  • Using workflow terminology such as “Task” will clearly specify what is expected for each element inside the workflow definition. This means that a Workflow is in charge of coordinating a set of tasks in a certain order.
    • A Task represents a unit of work at runtime.
    • Tasks can be specialized with different runtime behaviors such as
      • Operation Task: Logically encapsulate one or more Function calls
      • Fork/Join Task: Choose between different paths of the Workflow
      • SubFlow: Initiates a new instance of a Workflow
      • Event Consumer: Wait for a certain type of event
      • Event Producer: Produce a certain type of event
      • Each of the previous subtypes brings a different runtime behavior
    • Tasks must contain the minimal information needed to identify them such as ID, Name, Type. Then subtypes can extend this information as needed.
    • Cloud Events must be first-class citizens inside the spec, meaning that Tasks and Cloud Events relationships need to be clearly defined
      • A Task for Event Emitting/Consuming can be clearly defined using Cloud Events definition references
    • Transitions between Tasks, and from Events to Tasks, need to be clearly specified; the term transition is well understood and should be used as it is currently defined.

The language itself doesn't specify how a Workflow will execute these tasks, that is left for each implementation to decide.

Simplification of concepts at the core of the specification proposal

As part of the terminology change, we need to make sure that the terms are not overloaded to an extent where the spec is confusing. This will require changes in the description of the Tasks (previously States) to reduce their responsibility to the minimum.
As examples for these simplifications, we can start with core constructs such as:

  • Operation Task: An Operation Task logically encapsulates one or more function calls. The Operation Task assumes successful function calls; when all the specified functions are called correctly, the task is finished. As previously defined, the user can fine-tune how these functions are called, in sequence or in parallel.
  • Event Consumer Task: An Event Consumer Task waits for one or more events with the specified event type and filters. Event consumers can specify filters that use event metadata to automatically discard events. If more than one event is specified, the Event Consumer Task will wait for all the specified events; when all of them are present, the Task will finish.
  • Fork/Join Task: Fork/Join Tasks are used for flow control. Depending on the type, the Fork Task can fork a current execution into two or more paths of execution and the Join Task joins multiple paths into a single one.

Data Flow considerations and scope

Data Flow and Data evaluations and expressions inside workflows should be kept separate from orchestration as much as possible. Complex data handling should be kept as a sub specification. This will help us to keep the orchestration language simple. Having said this, we need to make sure that the workflow language is extensible enough to support Complex Data Flow extensions in the future.

As proposed in the spec, Workflow Instances can work with a JSON payload, where tasks can add and change data, that can be used to call consecutive functions down the line and to keep the instance contextual data. This context is shared and accessible for all the tasks in the workflow. Expressions can use the data inside this JSON payload to make decisions based on the available values.
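The shared-context model described above can be sketched roughly as follows. The task functions, key names, and values are invented purely for illustration; the spec itself does not prescribe how an implementation threads the payload through tasks.

```python
def run_workflow(tasks, data):
    """Run tasks in order; each receives the shared JSON-like payload,
    may add or change keys, and passes it on to the next task."""
    for task in tasks:
        data = task(data)
    return data

# Hypothetical tasks operating on the shared context:
def fetch_customer(data):
    data["customer"] = {"id": data["customerId"], "tier": "gold"}
    return data

def apply_discount(data):
    # An expression-like decision based on values already in the context.
    data["discount"] = 0.1 if data["customer"]["tier"] == "gold" else 0.0
    return data
```

Because every task sees the whole context, later tasks (and transition expressions) can make decisions on any value an earlier task produced, which is exactly the accessibility property the paragraph above describes.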

For the sake of simplicity, Tasks do not impose inputs and outputs at the CNCF Workflow Language level; once again, this can be incorporated later as an appendix to the spec.

Pull Request

I am happy to provide a Pull Request with the proposed changes here, but even before sending the PR I would love to get feedback from all the parties involved to make sure that we are all on the same page.
After checking with different people, I believe that these proposed changes will bring clarity and provide a consistent core that we can mature over time, without confusing newcomers who find the current state of the spec too complex and confusing.

@tsurdilo @cathyhongzhang @mswiderski @berndruecker @manuelstein Feedback is highly appreciated.

[workflow] Proposal: Serverless Workflow WG Scope

As presented in San Diego, the scope of the group has been the definition of a vendor-neutral Workflow Language for Serverless Applications.

As stated in the main README file of the group:

“The goal of the Serverless Workflow sub-group is to come up with a standard way for users to specify their serverless application workflow, as well as help, facilitate portability of serverless applications across different vendor platforms.
Serverless Workflow is a vendor-neutral and portable specification that meets these goals.”

While this is a big effort in itself, having just a language will not encourage cloud providers to implement it; we need to define how the language and the concepts defined in the group will map to actual implementations in the existing CNCF ecosystem.

As stated in issue #127, the lack of a clear scope, and of a definition of how the current language will be handled by different vendors, is causing confusion, hence this proposal.

Proposal

The scope of the group should be changed to cover two main angles:

  • Workflow Language (well-scoped for the first iteration)
  • How cloud providers can integrate with Workflows, i.e. which standard APIs/contracts cloud providers should implement in order to integrate with workflows.

By covering these two points we will promote (as a group) the use of the vendor-neutral language and define how cloud providers can understand these workflows.

In a Kubernetes world, the integration with the ecosystem can be done by defining a new concept, a “Workflow” custom resource (CRD), that Kubernetes can understand, so it can then manage and monitor the lifecycle of each Workflow.

By providing this new concept, each cloud provider can understand that there is a new Workflow available and then do whatever is necessary to deploy, execute and monitor these Workflows.
The “Workflow” concept itself can include a reference to the workflow definition and metadata that can be used as an envelope for the definition itself.

The “Workflow” concept becomes part of a standard API that is vendor-neutral, just like the workflow definition language itself.
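As a minimal sketch of this idea, such a “Workflow” custom resource might look like the following. Note that the apiVersion, kind, and all field names here are hypothetical illustrations, not a proposed standard:

```yaml
# Hypothetical "Workflow" custom resource; every name below is
# illustrative only and not part of any settled API.
apiVersion: serverless.workflow.example/v1alpha1
kind: Workflow
metadata:
  name: greeting-workflow
  labels:
    app: greetings
spec:
  # Envelope metadata around the vendor-neutral definition.
  definitionRef:
    # Reference to where the workflow definition itself lives,
    # e.g. a ConfigMap holding the serverless workflow JSON/YAML.
    configMapName: greeting-workflow-definition
status:
  phase: Deployed
```

With such a resource, each provider could watch for Workflow objects and do whatever is necessary to deploy, execute, and monitor them, while the definition itself stays vendor-neutral.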

An initial proposal is provided in this PR:
#131

Feedback is more than welcome.

License

Please add a license to the repo.

AWS support of CloudEvents

Hi. We're working on a bunch of event-related stuff at AWS with a view to releases in 2019. It has been internally proposed that we "Support CloudEvents" and this seems to me like a good idea. No promises, but my opinion is that if we could find a plausible way to do it, there is a high likelihood that we would.

Important This is not an official communication, this is Tim looking for options I can use internally.

Background

We already have an AWS Event envelope that is widely used across AWS chiefly through the "CloudWatch Events" service. There's no formal schema or anything but it's documented here.

The required top-level fields are:

  • version: (Corresponds to CloudEvents specversion)
  • detail-type: (type)
  • source: (source)
  • id: (id)
  • time: (time)
  • detail: (data)

Fields in AWS Events not in CloudEvents: region, account, resources.

Fields in CloudEvents not in AWS Events: datacontenttype (So far we're all-JSON all the time.)

Problems

  1. There are already huge numbers of AWS Events per second flowing through our system and there's no API, the only documentation is of the bits on the wire. And, there are a huge number of AWS customers who are making use of this service. Therefore there is no reasonable prospect of us changing field names.

  2. CloudEvents isn't finished, but we want to ship production software in 2019. In theory, the working group could do a wholesale redesign of the attributes at any time. Can any organization plausibly claim to "Support CloudEvents" at this stage in history? It's not obvious to me how.

Appeal for input

Before I dive any deeper, I'd welcome general comments along the lines of "No, just wait till CloudEvents is finished", or "That is a really bad idea because X" for some value of X, or "You can effectively support CloudEvents by doing A, B, and C." Thanks in advance.

Some notes about possibilities

  1. CloudEvents could consider freezing the field names. I.e. adopt some sort of policy that whereas new attributes may be added, there's a commitment not to change the names of the ones that exist.
    1. And if you wanted to try making AWS an offer it couldn't refuse, consider changing the field names to match ours. We could add support for the extra CloudEvents fields and our extras could be extensions. There are some semantic incompatibilities (e.g. our "source" field looks like a relative URI reference) but probably there are workarounds. (OK, long shot).
  2. Make sure all the CloudEvents SDKs include getters (I don't think setters?) for all the specified attributes. That way, if someone like us has CloudEvents-like constructs but with incompatible names, we could implement the SDK and then claim we're compatible because you can always use the CloudEvents SDK to access the attributes. Is this sane?
  3. At the very least we could ship libraries that transcode between AWS Events and CloudEvents. Would that be enough to make users happy?
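To make option 3 concrete, here is a sketch of what such a transcoding helper could look like, following the field mapping listed under Background. The function name and the choice to carry AWS-only fields as CloudEvents extension attributes are assumptions for illustration, not an official AWS or CNCF API:

```python
# Sketch of transcoding the AWS Event envelope into a CloudEvents-style
# dict, using the field mapping given above. Hypothetical helper, not
# an official library.

AWS_TO_CE = {
    "version": "specversion",
    "detail-type": "type",
    "source": "source",
    "id": "id",
    "time": "time",
    "detail": "data",
}

# AWS-only fields, carried over here as CloudEvents extension attributes.
AWS_EXTENSIONS = ("region", "account", "resources")

def aws_to_cloudevent(aws_event: dict) -> dict:
    # Rename the mapped top-level fields.
    ce = {ce_key: aws_event[aws_key]
          for aws_key, ce_key in AWS_TO_CE.items() if aws_key in aws_event}
    # AWS Events are all-JSON, so datacontenttype is fixed.
    ce["datacontenttype"] = "application/json"
    # Preserve AWS-specific fields as extensions.
    for ext in AWS_EXTENSIONS:
        if ext in aws_event:
            ce[ext] = aws_event[ext]
    return ce
```

The reverse direction would invert the same mapping; the open question is whether users would accept a library boundary instead of native wire-format support.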

Unmanageable event expressions in event states

This is related to PR #153.
This issue tackles one particular problem with the current specification; let's call it

Unmanageable event expressions in event states

  1. First let's look at the parameters in a CloudEvent (the reason being that our workflow spec is said to "consume" CEs, so our expressions must match against the properties defined there):

In order for a workflow to say that event X might trigger some actions, it may have to match against these parameters, either one of them or all of them as needed.

  2. Let's now take a look at the event state as it is currently defined in our specification; here is an example:
"states":[  
  {  
     "name":"Sample Event State",
     "type":"EVENT",
     "eventsActions": [
      {
         "expression": {
           "language": "THE EXPRESSION LANGUAGE USED",
           "body": "THIS IS THE EVENT EXPRESSION HERE!!!!!"
         },
         "actions":[  
            {  
               .... 
            }
         ]
     }],
     .....
  }
]

Event states can have one or more "eventsActions" elements each one containing an expression which needs to match events.
The definition of this "expression" is: Boolean expression which consists of one or more Event operands and the Boolean operators.

What are these "event operands"? That is not defined, but in order to write an expression there has to be some reference such as "$event1", "$event2", and so on.

  3. Now let's attempt to write an expression; in our example let's say we are concerned with only one single CE:
{
    "specversion" : "1.0",
    "type" : "com.github.pull.create",
    "source" : "https://github.com/cloudevents/spec/pull",
    "subject" : "123",
    "id" : "A234-1234-1234",
    "time" : "2018-04-05T17:31:00Z",
    "comexampleextension1" : "value",
    "comexampleothervalue" : 5,
    "datacontenttype" : "text/xml",
    "data" : "<much wow=\"xml\"/>"
}

Using, let's say, the popular Spring Expression Language, and again assuming that an implementation gives us a reference to "$event",
we could write the following (deliberately taking the worst-case scenario where we have to match on all properties, to see what users might be facing):


"$event.specversion eq '1.0' and $event.type eq 'com.github.pull.create' and $event.source eq 'https://github.com/cloudevents/spec/pull' and $event.subject eq '123' and $event.datacontenttype eq 'text/xml'"

Now let's look at how this expression would look when multiple events are involved, as mentioned several times in the comments of PR #153, e.g. "event1 and event2 and event3 and event4":


"$event1.specversion eq '1.0' and $event1.type eq 'com.github.pull.create' and $event1.source eq 'https://github.com/cloudevents/spec/pull' and $event1.subject eq '123' and $event1.datacontenttype eq 'text/xml' and $event2.specversion eq '1.0' and $event2.type eq 'com.github.pull.create' and $event2.source eq 'https://github.com/cloudevents/spec/pull' and $event2.subject eq '123' and $event2.datacontenttype eq 'text/xml' and $event3.specversion eq '1.0' and $event3.type eq 'com.github.pull.create' and $event3.source eq 'https://github.com/cloudevents/spec/pull' and $event3.subject eq '123' and $event3.datacontenttype eq 'text/xml' and $event4.specversion eq '1.0' and $event4.type eq 'com.github.pull.create' and $event4.source eq 'https://github.com/cloudevents/spec/pull' and $event4.subject eq '123' and $event4.datacontenttype eq 'text/xml'"

Some questions:

  • does this look maintainable?

  • portable?

  • does this look like something we can test or debug or even read?

  • in the example I'm not even using things like the time parameter of the CE; additional date-based checks could be done, making expressions even more confusing

  • Imagine you are trying to match against 10 or 30 events :)

Proposed solution

In PR #153 the proposed solution is, rather than using expressions to match events, to make use of the "events" property of the workflow and offload the definition of these expressions to the implementation, while still being very specific and portable by defining exact matching where needed. Here is an example:

"events": [
 {
  "name": "GreetingEvent",
  "type": "greetingEventType",
  "source": "greetingEventSource"
 }
],

"states":[  
  {  
     "name":"Greet",
     "type":"EVENT",
     "eventsActions": [{
         "eventRef": {
            "name": "GreetingEvent"
         },
         "actions":[  
            {  
               ...
            }
         ]
     }]
  }
]

In the "events" property we define the CE with its pre-defined parameters.
Instead of using a boolean expression, we use "eventRef", which is a reference to a defined event.
Implementations then have to make sure that a CE triggers the actions only if it matches all the defined parameters of the "GreetingEvent".
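The matching an implementation would perform can be sketched as follows. The function name is illustrative, not part of the specification; the rule is simply that every parameter in the referenced event definition (other than its "name") must equal the corresponding CE attribute:

```python
# Sketch of "eventRef" matching: an incoming CloudEvent triggers the
# actions only if it matches every parameter of the referenced event
# definition. Hypothetical helper, not spec text.

def matches(event_def: dict, cloud_event: dict) -> bool:
    # "name" only identifies the definition for eventRef; skip it.
    params = {k: v for k, v in event_def.items() if k != "name"}
    return all(cloud_event.get(k) == v for k, v in params.items())

greeting_event = {
    "name": "GreetingEvent",
    "type": "greetingEventType",
    "source": "greetingEventSource",
}

incoming = {
    "specversion": "1.0",
    "type": "greetingEventType",
    "source": "greetingEventSource",
    "id": "A234-1234-1234",
}

assert matches(greeting_event, incoming)
```

Compared with the boolean-expression approach, this keeps the workflow definition declarative and lets each implementation choose how to evaluate the match efficiently.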

This is implemented and documented with examples in #153.

Workstream Proposal: Function Signatures

There are multiple providers that have different ways to handle functions. Using multiple providers, switching providers and developing functions would be significantly better experiences if there was a common structure for function signatures.

Some examples can be seen in function-signatures-examples.md but the same issues are present in other supported languages too.
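To illustrate the divergence in one language, here is the same trivial logic written against two widely documented provider styles, and then against a hypothetical common signature. The common form is an assumption for discussion, not an existing standard:

```python
# AWS Lambda (Python) style: a raw provider event plus a provider
# context object.
def aws_handler(event, context):
    return event["name"]

# Google Cloud Functions background-function style: data plus context.
def gcf_handler(data, context):
    return data["name"]

# Hypothetical common signature: a single CloudEvent-shaped argument,
# so the same function body would be portable across providers.
def common_handler(cloud_event):
    return cloud_event["data"]["name"]
```

The function bodies are identical; only the envelope differs, which is exactly the friction a common signature structure would remove.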


Please discuss, then vote with a 👍 if you would like this to be the next workstream item.

Workstream Proposal: Workflows / Function Composition

Many serverless applications are not a simple function triggered by a single event; instead, they are composed of a workflow/graph of functions, with events and functions interleaved.

A user needs a standard way to specify their serverless use case workflow. For example, one use case could be "do image enhancement and then face recognition on a photo when a photo is uploaded onto the cloud storage (photo storage event happens)." Another IoT use case could be “do motion analysis” when a motion detection event is received, then depending on the result of the analysis function, either “trigger the house alarm plus call to the police department” or just “send the motion image to the house owner.”

A user’s workflow may involve both events and functions. For example, in a workflow the user can specify which combination of events triggers which functions, whether those functions are executed in sequence or in parallel, what information is passed from one function to the next, and whether the next function execution needs to wait for another event to happen.
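In the style of the event-state examples discussed elsewhere in this document, the photo-upload use case above might be sketched like this (the "functionRef" property and the other names are illustrative assumptions, not settled spec syntax):

```json
{
  "events": [
    { "name": "PhotoUploaded", "type": "photoUploadedType", "source": "photoStorage" }
  ],
  "states": [
    {
      "name": "ProcessPhoto",
      "type": "EVENT",
      "eventsActions": [{
        "eventRef": { "name": "PhotoUploaded" },
        "actions": [
          { "functionRef": "imageEnhancement" },
          { "functionRef": "faceRecognition" }
        ]
      }]
    }
  ]
}
```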

Some information discussed in CloudEvents, such as the correlation id, is associated with a use-case workflow and does need to be specified in the workflow specification. While we work on the workflow specification, we might find that some attributes are missing from CloudEvents.


Please discuss, then vote with a 👍 if you would like this to be the next workstream item.
