
Comments (8)

jbbarth avatar jbbarth commented on July 18, 2024 1

@tkornai @frncmx : just FYI #110 has been merged with other improvements and we're rolling this out in production this week at Botify. We're pretty confident now that the process management works reasonably well. There might be bugs or corner cases, we'll see with time :-)

As for the initial request of this issue, we now expose workflow information, including the tags of the executions; see #177, which has been released with simpleflow 0.12.6. The examples/basic.py file shows how it works; here's the output of the debug print:

% simpleflow standalone --nb-deciders 1 --nb-workers 1 --heartbeat 5 --tags tag1,tag2,tag3 examples.basic.BasicWorkflow --input '[1]'
...
DEBUG: execution context: {'run_id': u'22H0aD/pVKD/k9iZWnriGk/YR/6jfVbl6M/vww8fUF96k=', 'workflow_id': u'basic', 'tag_list': [u'tag1', u'tag2', u'tag3'], 'version': u'example', 'name': u'basic'}
...

I'm closing this issue. Thanks for suggesting this!

from simpleflow.

jbbarth avatar jbbarth commented on July 18, 2024

Well, unfortunately, simpleflow doesn't expose that information at the moment, but it's definitely something that should be possible given SWF's design.

A decider mainly calls PollForDecisionTask, which returns all the events in the workflow; it then runs the workflow code and returns a set of decisions to SWF via RespondDecisionTaskCompleted.
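A minimal sketch of that cycle, not simpleflow's actual code. It assumes a boto3-style SWF client exposing poll_for_decision_task and respond_decision_task_completed; run_decider_once and the decide callable are hypothetical names, and pagination of the history (nextPageToken) is ignored for brevity:

```python
def run_decider_once(client, domain, task_list, decide):
    """Poll SWF once, run the workflow logic, and send back its decisions.

    `client` is assumed to be a boto3 SWF client (or any object with the
    same two methods). `decide` is a hypothetical callable that maps the
    list of history events to a list of SWF decisions.
    """
    task = client.poll_for_decision_task(
        domain=domain, taskList={"name": task_list}
    )
    token = task.get("taskToken")
    if not token:
        # The long poll expired without a decision task: nothing to do.
        return None
    # Run the "client" workflow code against the full event history.
    decisions = decide(task["events"])
    client.respond_decision_task_completed(taskToken=token, decisions=decisions)
    return decisions
```

The tag list lives in the first history event, so anything the workflow code should see has to be extracted from the events before decide runs.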

At the simpleflow level, the interesting bit happens here: https://github.com/botify-labs/simpleflow/blob/master/simpleflow/swf/executor.py#L449-L458 :

  • line 449: we parse the events, and especially the first event which contains the workflow input (and the tag list you want to use)
  • line 458: we execute the "client" workflow code after parsing the "input"

At this point we lose the tag information and your workflow doesn't have access to it directly. Do you think we should populate a specific "kwarg" with the remaining information from the workflow start (like its events, workflow id, run id, ...)? The problem is that it would be a real breaking change that needs to be taken into account in workflow definitions.

Today we have a similar use case at Botify, and we simply pass the information via tags AND via some kwargs in the workflow input. So we issue start_workflow_execution(..., tag_list="foo=bar", input=json.dumps({"foo": "bar", ...})). That leads to some duplication, but tags are primarily here for executions filtering, and we consider the input to be the single source of truth for the workflow execution.
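To keep the duplicated values from drifting apart, both arguments can be derived from one dict. This is only a sketch: build_start_arguments is a hypothetical helper, the "key=value" tag format follows the example above, and passing tag_list as a list is an assumption about the caller's API:

```python
import json


def build_start_arguments(metadata, workflow_kwargs):
    """Derive both the SWF tag list and the JSON input from one dict.

    `metadata` holds the values that must appear as tags AND in the input;
    `workflow_kwargs` holds the remaining workflow arguments.
    """
    tag_list = ["{}={}".format(key, value) for key, value in sorted(metadata.items())]
    if len(tag_list) > 5:
        # SWF accepts at most 5 tags per workflow execution.
        raise ValueError("SWF accepts at most 5 tags per execution")
    payload = dict(workflow_kwargs)
    payload.update(metadata)  # the input stays the single source of truth
    return {"tag_list": tag_list, "input": json.dumps(payload, sort_keys=True)}
```

The returned dict can then be unpacked into the start call, for example start_workflow_execution(..., **build_start_arguments({"foo": "bar"}, {...})).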

Another method (maybe ugly) would be to make direct boto calls in the workflow execution, or better to have a first task in the workflow that does this and returns your tag list as a result.
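A rough sketch of such a lookup, assuming a boto3-style SWF client (the thread itself talks about boto, whose method names differ) and assuming the caller already knows the workflow id and run id. fetch_tag_list is a hypothetical helper, not part of simpleflow:

```python
def fetch_tag_list(client, domain, workflow_id, run_id):
    """Return the tags attached to one workflow execution.

    `client` is assumed to be a boto3 SWF client; DescribeWorkflowExecution
    reports the tags under executionInfo.tagList.
    """
    response = client.describe_workflow_execution(
        domain=domain,
        execution={"workflowId": workflow_id, "runId": run_id},
    )
    # tagList is absent when the execution was started without tags.
    return response["executionInfo"].get("tagList", [])
```

The caller needs swf:DescribeWorkflowExecution permission for this to work.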

Anyway, I confirm that it's not easy as is with the current version of simpleflow. What do you think?


frncmx avatar frncmx commented on July 18, 2024

First of all, thank you for your quick and very detailed answer.

tags AND kwargs

We restrict SWF API for security reasons. If we cannot check tag_list from the application code, then the following could happen.

  • An attacker provides the tag to be able to start the execution, but does not provide the same tag in the kwargs.
  • That would mean he/she is able to start an arbitrary workflow, which might affect not just test services, but production services as well.

Here is the IAM policy that makes these restrictions possible (just to give you a little more context).

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "swf:StartWorkflowExecution",
      "Resource": "arn:aws:swf:${var.region}:${var.root-account}:/domain/${var.domain}-*",
      "Condition" : {
        "StringEquals" : {
          "swf:workflowType.name" : "${var.restricted-workflow}",
          "swf:tagList.member.0" : "${var.test-tag}"
        },
        "Null" : { "swf:tagList.member.1" : "true" }
      }
    }
  ]
} 

We could use a regexp with StringLike filter on the input field to make sure the required tag is there, but I think that would be quite cumbersome.

Conclusion: I feel this workaround is not really an option for us.

Separate activity task to return the tags for the current execution

I do not feel this is an optimal solution, but I think it could work. However, I don't know yet how to access the tags from a task.

  • Could I use your library's code to construct an object and access the tags through it?
  • If I need to make a low level boto call then I guess I'll need some kind of IDs (run_id, workflow_id) to identify the workflow. Is all the necessary information currently available in activity tasks?

Breaking change

You use semantic versioning for releases, and the software is still below v1.0, so I think it is OK to have a breaking change rather than having workarounds in later versions of the software.

If you really want to avoid the breaking change and I understand the code well, then you might want to pass the tag_list as an instance variable to self._workflow (where self is the executor parsing the history).

Or just add tag_list to **kwargs as a key after you have parsed the input field.
(https://github.com/botify-labs/simpleflow/blob/master/simpleflow/swf/executor.py#L449-L458)
That does not feel like a clean solution, since kwargs is explicitly defined in the input JSON.


jbbarth avatar jbbarth commented on July 18, 2024

tags AND kwargs

Wow! We don't have such advanced IAM policies, but I totally understand your concerns given that information. OK!

Separate activity task to return the tags for the current execution

We're hitting the same kind of problem, sorry. PollForActivityTask exposes the workflowId and runId parameters to the activity worker, but as far as I know they're not passed to the activity task.

Breaking change

Yes about semantic versioning, but that's more a facade for the outside world :-). At Botify we have been using simpleflow in production for 2 years (in fact only some parts of it: not process management nor workers, basically just the decider part), so if I change signatures I have to coordinate a change across a dozen internal workflows. That's definitely doable though!

Extending kwargs is not possible because they're passed to our workflows' run() method, which in most cases has a strict signature like def run(pos1, pos2, kwarg1=None), so adding a new kwarg would raise a TypeError. But adding attributes to the workflow itself looks promising, definitely worth exploring!
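A small illustration of the difference between the two approaches. Workflow and execute are stand-ins written for this example and are not simpleflow's actual classes or executor code:

```python
class Workflow:
    """Stand-in for a workflow class with a strict run() signature."""

    def run(self, pos1, pos2, kwarg1=None):
        # The tag list is read from the instance, not from the arguments.
        return pos1, pos2, kwarg1, getattr(self, "tag_list", [])


def execute(workflow, args, kwargs, tag_list):
    """Hypothetical executor step: attach the tags, then call run()."""
    workflow.tag_list = list(tag_list)
    return workflow.run(*args, **kwargs)


# Injecting an extra kwarg breaks every workflow with a strict signature.
try:
    Workflow().run(1, 2, tag_list=["tag1"])
except TypeError as exc:
    print("extra kwarg rejected:", exc)

# Setting an attribute leaves existing run() signatures untouched.
print(execute(Workflow(), [1, 2], {"kwarg1": "x"}, ["tag1"]))
```

Workflows that never read self.tag_list keep working unchanged, which is what makes the attribute route non-breaking.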

Another note

As you seem to be exploring simpleflow's limits these days, note that:

  • simpleflow decider logic and the whole swf.* namespace are pretty mature when you have simple workflows ; we have been using them in production for 2 years, definitely not perfect but still, it's OK ; you ran into a specific problem where you want to use tags inside the workflow, we'll try to address that
  • BUT simpleflow command line wrappers and workers are not ready for production ; I'm working on that right now and it should be fixed by the end of September (see #110 for more details) ; for instance the heartbeater doesn't work in current master, and I'm not sure you can stop your workers correctly for upgrading them

I'm definitely happy and open to any help / bug report you can provide, but I just wanted to warn you so you don't get angry at simpleflow trying to use code that's not stable enough for now.

I'll try to explore how to expose tags with your idea and ping you back! Thanks again!


frncmx avatar frncmx commented on July 18, 2024

Breaking change

My only worry with changing the run_execution() function's signature to add tag_list there is that we might later discover that something else is missing. One example could be --task-priority.

Thank you for your warning! We started to explore simpleflow like 2 weeks ago.
The response time we get from you is pretty amazing, so we are not too worried for now. However, it's true that we have already discovered an important missing feature.


tkornai avatar tkornai commented on July 18, 2024

Hi Jean-Baptiste,

I'm a colleague of @frncmx and we are working on the same project involving simpleflow. I'd like to ask a few clarifying questions so we can better understand what you mean when you say that workers are not yet production ready.

Our main driver of using your tool is that our activities (i.e. workflow steps) are already implemented in Python. Can I ask how you run activity tasks if not by simpleflow wrappers?

If we start up our activity workers with simpleflow and they receive activity tasks, we see error messages like this:

2016-08-31T14:57:16 ERROR [process=Process-1:4, pid=59]: error "Unable to complete activity task with token: ...

This might very well be a bug in our own code. I just wanted to clarify whether this is an issue that you are familiar with. Is it related to the process management issue you are tackling in #110? If not, what sort of misbehavior can we expect from using simpleflow wrapped activity workers?

We really appreciate your help!

Thanks,
Tamas


jbbarth avatar jbbarth commented on July 18, 2024

First the error:

2016-08-31T14:57:16 ERROR [process=Process-1:4, pid=59]: error "Unable to complete activity task with token: ...

It's generally the sign that the activity cannot be completed anymore (unless the token is wrong, but I'm pretty confident this part of simpleflow works correctly, and it's automatic, you cannot do something wrong on that side). From my experience it happens if 1/ the activity has timed out, or 2/ the workflow has timed out, failed or completed (i.e. it's not "open" anymore). You need to check both possibilities, for instance via the SWF web console.

If it's an activity timeout, you need to find which "timeout type" was triggered. If it's a "heartbeat" timeout, you may have experienced one problem I detail below. Otherwise, let me know.
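Besides the web console, the timeout type can be read from the execution history. The sketch below assumes events shaped like the ones a boto3 SWF client returns from get_workflow_execution_history; activity_timeouts is a hypothetical helper name:

```python
def activity_timeouts(events):
    """Return (eventId, timeoutType) for every ActivityTaskTimedOut event.

    `events` is assumed to be the "events" list of a workflow execution
    history, as a list of dicts.
    """
    found = []
    for event in events:
        if event.get("eventType") == "ActivityTaskTimedOut":
            attrs = event["activityTaskTimedOutEventAttributes"]
            # timeoutType is one of START_TO_CLOSE, SCHEDULE_TO_START,
            # SCHEDULE_TO_CLOSE or HEARTBEAT.
            found.append((event["eventId"], attrs["timeoutType"]))
    return found
```

An empty result points to the second possibility: the workflow itself is no longer open.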


When I say that simpleflow process management is not production ready, it's mostly because it has 2 annoying bugs that will bite you if you're heavy users like us:

  • the "heartbeat" mechanism in the activity worker doesn't work (already fixed in #110) ; which means that if you set up a "heartbeat timeout" for an activity task, and the duration of this task exceeds this timeout, SWF will mark the activity as having timed out. The heartbeat mechanism is totally optional, but it's vital for us because we run workflows on unreliable machines on AWS cloud
  • the processes don't stop properly when you send them a SIGTERM ; if you're just starting to use SWF that may not be a problem, but for us it's a no-go as we may have hundreds of concurrent workflow executions and we still want to be able to shut down everything to let a new code version handle everything

That's pretty much it, but you may find other minor problems like the one @frncmx reported about the --tags option.

The good news is I'm nearly full-time on this in September, so that should improve soon.


tkornai avatar tkornai commented on July 18, 2024

Thanks for the detailed answer. Regarding the error: we believe it was an issue in our workflow definition; we failed to properly wait for one of the tasks. Luckily, the subprocesses that are not stopped properly won't really hurt us, as we are running the activity workers in Docker containers and deploys restart the containers. Thanks for pointing out that heartbeats are not yet working; we will only enable them once they are supported. Do you have an ETA for when we can expect #110 to be merged?

