Comments (8)

n8han commented on June 16, 2024

> Such behavior sort of kills the whole "async" principle.

Well, there are limits. Every NIO app I've put into production has required bumping up the default Linux ulimit, and then things work pretty well. However, I understand that you want something different to happen when you hit a threshold of concurrent operations.

> calls with some kind of a proxy Future, which would execute the actual call on some dedicated thread - and that thread would perform blocking using some semaphore

I think a blocked thread would be less efficient than setting a very high ulimit and letting the OS sort it out. And that is what I would recommend doing as a quick fix for your issue.

If we want to support throttling within Dispatch I think it would be with an actor as @slandelle suggests, since we want to maintain a queue of pending requests when too many are believed to be active. (Of course, then we hope the threshold is not exceeded for very long or else we run out of memory -- another limit!) This support could be built into a separate module that depends on actors. I don't think it would be that hard to build, but before I get to that I'll need to finish the 2.10 migration.
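
Roughly the kind of thing I have in mind, as a sketch only (nothing like this exists in Dispatch today, and I'm using Akka and scala.concurrent.Future purely for illustration; the names and the limit are made up):

    import akka.actor.Actor
    import scala.collection.immutable.Queue
    import scala.concurrent.Future

    object Throttler {
      // A deferred request: applying the function fires it.
      final case class Run(task: () => Future[Any])
      case object Done
    }

    // Keeps at most `limit` requests in flight; the rest wait in a queue.
    class Throttler(limit: Int) extends Actor {
      import Throttler._
      import context.dispatcher // ExecutionContext for the onComplete callback

      private var inFlight = 0
      private var pending = Queue.empty[() => Future[Any]]

      def receive = {
        case Run(task) =>
          if (inFlight < limit) start(task)
          else pending = pending.enqueue(task) // over the threshold: queue it
        case Done =>
          inFlight -= 1
          if (pending.nonEmpty) {
            val (next, rest) = pending.dequeue
            pending = rest
            start(next)
          }
      }

      private def start(task: () => Future[Any]): Unit = {
        inFlight += 1
        task().onComplete(_ => self ! Done) // free the slot on success or failure
      }
    }

A fuller version would pipe each result back to the sender, and as noted above it still has to worry about the pending queue growing without bound.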

For now I'm closing this issue as there will be no immediate work on it, but feel free to raise it on the mailing list if you want to get more feedback on how others manage the same problem.


Rogach commented on June 16, 2024

In the meantime, here's the workaround that I use:

    import java.util.concurrent.Semaphore
    import dispatch._ // for Promise

    // Run the deferred requests with at most 5 in flight at a time.
    def safePromiseAll[A](promises: Seq[() => Promise[A]]): Seq[A] = {
      val semaphore = new Semaphore(5, true)
      promises.map { pr =>
        semaphore.acquire()      // wait for a free permit before firing the request
        pr().onComplete { _ =>   // onComplete returns the promise, so this map yields Seq[Promise[A]]
          semaphore.release()    // hand the permit back on success or failure
        }
      }.map(_())                 // block until each promise yields its value
    }
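
For context, a call site might look something like this (assuming the 0.9-style API, where Http(req OK as.String) yields a dispatch.Promise[String]; the URLs are made up):

    import dispatch._

    val urls = Seq("http://example.com/a", "http://example.com/b")

    // Each element is a deferred request: nothing is sent until the function is
    // applied, which lets safePromiseAll decide how many run at once.
    val deferred: Seq[() => Promise[String]] =
      urls.map(u => () => Http(url(u) OK as.String))

    val bodies: Seq[String] = safePromiseAll(deferred)

Note that both acquire() and the final .map(_()) block the submitting thread, so this trades some asynchrony for a hard cap on concurrency.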


jackcviers commented on June 16, 2024

@n8han @Rogach high ulimits are the way to go.

I don't think request throttling is dispatch's responsibility. If a user needs to make a lot of requests (something that can inherently fail) and also needs to guarantee some kind of success, there are patterns, like error kernels or request pools, to handle that use case. I don't think mutex blocking is the solution: dependent requests that yield results, as in the examples on the reboot site, naturally limit the number of open sockets and provide a way to manage failure and retries. Throttling is the responsibility of either the user or the underlying NIO library.


Rogach commented on June 16, 2024

@jackcviers Actually, I hit the aforementioned problem when doing dependent requests - it's just that those dependent requests resulted in an exponential increase in the number of requests to be made, which led to the problem.

Consider the following case - simple web crawling, where each page that is downloaded generates more links to crawl. Given the current implementation, errors begin to happen very quickly - and since the behavior isn't documented and the API lends itself to this way of coding, it's an easy mistake to make.

Also, you're going to see problems well before the ulimit is exceeded - 100 concurrent requests already overload the interface, resulting in timeouts and other errors.

My point is, either we remove Promise.all from the library completely and state in the documentation that users should be very careful with their requests, or we implement throttling in the library.


polymorphic commented on June 16, 2024

@Rogach could you please expand on your workaround? I'm seeing similar problems in an Akka system where a set of actors makes POST requests via scalaxb (which under the covers uses dispatch). Any other insight into preventing hitting the limit? For example, a way of querying how many requests are in progress would make it possible to rate-limit somewhere upstream.


Rogach commented on June 16, 2024

@polymorphic - I don't see an easy way of querying the number of currently executing requests - you'd probably need to go all the way down to the AsyncHttpClient internals for that, and it's not going to be pretty.

As I see it, in your case (many separate actors querying the same Http instance), you basically have two options: either implement some sort of token mechanism with a separate actor, which would manage the querying and send the results back to the requesting actor, or go the semaphore route - just carry the semaphore alongside the Http instance (you can probably even create a wrapper).
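
For the semaphore route, the wrapper can be as simple as something like this (a sketch with a made-up RequestGate name, written against scala.concurrent.Future; the same shape works with dispatch's own Promise, and the limit is whatever your interface can handle):

    import java.util.concurrent.Semaphore
    import scala.concurrent.{ExecutionContext, Future}
    import scala.util.control.NonFatal

    // Carry this alongside the shared Http instance and route every request
    // through run(). acquire() blocks the submitting thread once `limit`
    // requests are already in flight.
    class RequestGate(limit: Int)(implicit ec: ExecutionContext) {
      private val slots = new Semaphore(limit, true)

      def run[A](request: () => Future[A]): Future[A] = {
        slots.acquire()                      // wait for a free slot (may block)
        try {
          val f = request()                  // actually fire the request
          f.onComplete(_ => slots.release()) // free the slot on success or failure
          f
        } catch {
          case NonFatal(e) =>                // request() itself threw before returning a future
            slots.release()
            throw e
        }
      }
    }

Each actor would then call something like gate.run(() => Http(req OK as.String)) instead of hitting the Http instance directly.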

I actually think it would be possible to build the semaphore stuff directly into dispatch, thus retaining the pretty syntax and control flow while keeping the whole interaction safe from ulimit and related exceptions - but that would introduce the possibility of blocking in the submit calls, which is not good (though it can also be worked around on the client side).


polymorphic commented on June 16, 2024

@Rogach thank you for the answer. I can't afford to go ahead without capping the ulimit; I tested and saw it go over 10K connections but that's asking for trouble :-/

I've tried capping the number of simultaneous connections by configuring the AsyncHttpClient, which provides a ThrottleRequestFilter that implements blocking along the lines of your workaround. That hasn't stopped the connection count from growing, though; I'm doing this from inside a scalaxb-generated WSDL client, so perhaps I missed something. Other suggestions welcome - I bet others have had to deal with this.
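
For reference, the wiring is roughly the following (a sketch assuming AsyncHttpClient 1.x, where ThrottleRequestFilter lives in com.ning.http.client.extra, and Dispatch's Http.configure; the limit of 100 is arbitrary):

    import dispatch._
    import com.ning.http.client.extra.ThrottleRequestFilter

    // Cap in-flight requests at the client level: the filter blocks further
    // request submission once 100 requests are already outstanding.
    val throttledHttp = Http.configure(
      _.addRequestFilter(new ThrottleRequestFilter(100))
    )

The filter only applies to requests that go through this particular instance, so the scalaxb-generated client has to be pointed at it; that may be the part I'm missing.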


oliviertoupin commented on June 16, 2024

I get the "too many open files" errors using only 4 connections (4 hosts), by calling the resource that spawns those 4 connections several times. I run it at most 10 times before it breaks, yet I end up with 10000 open files. That's 10000 files for 40 connections; the numbers don't add up.

Using lsof, I can see ~6500 pipes, ~3200 anon_inodes, and only 400 sockets (and most of those were there before the dispatch call).

So, as you can see, there might still be an issue: there are way too many open files for the number of requests.

I'm using dispatch 0.11.

