
Comments (8)

sargun commented on May 30, 2024

I think if you natively implement the API, and you're a stateless microservice in something like Dropwizard or Spring Boot, your web framework probably gives you the right tools to do concurrency limiting.

Ah, I guess I didn't understand that you wanted a specific implementation of the use case. I'll reopen the PR.

I can add the same mechanism to the classic watchdog as well.

from of-watchdog.

alexellis commented on May 30, 2024

Hi, thanks for your suggestion.

Please read the full contribution guide and edit your proposal so that it lines up with the guidelines provided for you by the maintainers and community.

https://github.com/openfaas/faas/blob/master/CONTRIBUTING.md#i-have-a-great-idea

Your suggestion is a good starting point, but is missing many key details including examples of how to concretely reproduce the issue.

Looking forward to reviewing this again once you've spent some more time on it.

Alex


sargun commented on May 30, 2024

Do you prefer I edit the original comment in place, or append comments to the thread?

I guess what I find unclear is what you mean by "how to concretely reproduce the issue", after looking at the proposal guidelines. The specific issue is that I have an ffmpeg process which takes 2GB of RAM to start, and if too many start at the same time, then the container gets terminated by the OOM killer.

There are other cases where I think this is valuable too (specifically, being able to stop GC during the course of a request). I, as the operator, know how many resources my container has, as well as how many resources my workload will take, so I'm the best person to implement this.

Implementing this coordination logic in ffmpeg or flask is difficult (without pulling some interesting tricks with posix advisory locks, or SysV magic, I'm not sure how you'd do it).

Trying to implement this at the gateway, I feel, has several pitfalls. The first is that it prevents scale-out without a gossip / coordination tier. The second is that it introduces some operational complexity around keeping track of what work the watchdog is doing.
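To make the idea concrete, here is a minimal sketch of what limiting in-flight requests inside the process that fronts the workload could look like. It is not the code from the PR: `Limiter`, `NewLimiter`, `TryAcquire`, `Release` and `Wrap` are hypothetical names, and rejecting with HTTP 429 is an assumption about how a caller would be told to back off.

```go
package main

import (
	"net/http"
	"sync/atomic"
)

// Limiter caps how many requests are handled at once.
// Hypothetical type for illustration; not the of-watchdog API.
type Limiter struct {
	max      int64 // maximum number of concurrent requests
	inFlight int64 // current number of requests holding a slot
}

// NewLimiter returns a Limiter that allows at most max concurrent requests.
func NewLimiter(max int64) *Limiter {
	return &Limiter{max: max}
}

// TryAcquire reserves a slot and reports whether one was free.
// It never blocks: the counter is bumped and rolled back on overflow.
func (l *Limiter) TryAcquire() bool {
	if atomic.AddInt64(&l.inFlight, 1) > l.max {
		atomic.AddInt64(&l.inFlight, -1)
		return false
	}
	return true
}

// Release frees a slot obtained by a successful TryAcquire.
func (l *Limiter) Release() {
	atomic.AddInt64(&l.inFlight, -1)
}

// Wrap rejects a request with 429 when no slot is free (an assumed
// status code), otherwise it runs next while holding a slot.
func (l *Limiter) Wrap(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !l.TryAcquire() {
			http.Error(w, "concurrent request limit exceeded", http.StatusTooManyRequests)
			return
		}
		defer l.Release()
		next.ServeHTTP(w, r)
	})
}
```

With something like `NewLimiter(1).Wrap(handler)` in front of the handler that forks ffmpeg, a second request arriving while the first is still running would be turned away instead of starting another 2GB process.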

As far as:

  • Effort required up front
    ~12 LoC
  • Effort required for CI/CD, release, ongoing maintenance
    ~100s of LoC


sargun commented on May 30, 2024

Also, I had a slight misunderstanding of the contributing.md. I read:

Please do not raise a proposal after doing the work - this is counter to the spirit of the project.

But it sounds like:

Please do not do any work until a proposal is accepted.

It's very difficult to estimate the cost / complexity / perf implications of a change like this without writing the code, especially for concurrency-oriented code, where testing can be difficult. FWIW, I wrote a much more "go-esque" channel-based implementation, but it didn't work out from a performance perspective, and testing was problematic for reasons similar to why the spinlock nastiness is complicated here.
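For readers unfamiliar with the pattern, a channel-based limiter in Go usually means a buffered channel used as a counting semaphore. The sketch below only illustrates that general pattern; it is not the implementation mentioned above, and `ChanLimiter` and its methods are hypothetical names.

```go
package main

// ChanLimiter uses a buffered channel as a counting semaphore:
// each value sitting in the channel represents one request in flight.
// Hypothetical type for illustration only.
type ChanLimiter struct {
	slots chan struct{}
}

// NewChanLimiter returns a limiter with room for max concurrent holders.
func NewChanLimiter(max int) *ChanLimiter {
	return &ChanLimiter{slots: make(chan struct{}, max)}
}

// TryAcquire takes a slot if the buffer has room and returns
// immediately with false when the buffer is full.
func (c *ChanLimiter) TryAcquire() bool {
	select {
	case c.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// Release frees one slot. It must only be called after a successful
// TryAcquire, otherwise the receive blocks on an empty channel.
func (c *ChanLimiter) Release() {
	<-c.slots
}
```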


alexellis commented on May 30, 2024

Hi @sargun

As I mentioned before this is definitely something that I am aware of and that we have talked about before.

Say hello to Justin for me?

I've had a detailed discussion with @stefanprodan and think this is a suitable approach. It also needs to be replicated in the classic watchdog in the openfaas/faas repo.

I think for this to be turned on for users we need to add a retry with exponential backoff in the gateway. As a feature which is off by default and solves your problem, I think this can be merged and released quite quickly.

I've tried to re-open the PR but it seems to be locked. Let me know how you'd like to proceed.

Alex


alexellis commented on May 30, 2024

I've gone ahead and created a sample repository with which the issue can be reproduced and tested before/after: https://github.com/alexellis/fill-memory-limits (this is what I was asking for earlier)


alexellis commented on May 30, 2024

An additional consideration is that this doesn't help users who are working with Docker images that conform to the OpenFaaS workload, i.e. a stateless microservice such as a Vert.x app which has decided not to package the of-watchdog inside its container.

At that point perhaps they just need to add the shim, and this feature makes a better case for having it.


alexellis commented on May 30, 2024

Derek close: merged / released

