Comments (8)

n8han commented on June 16, 2024

> Such behavior sort of kills the whole "async" principle.

Well, there are limits. Every NIO app I've put into production has required bumping up the default Linux ulimit, and then things work pretty well. However, I understand that you want something different to happen when you hit a threshold of concurrent operations.

> calls with some kind of a proxy Future, which would execute the actual call on some dedicated thread - and that thread would perform blocking using some semaphore

I think a blocked thread would be less efficient than setting a very high ulimit and letting the OS sort it out. And that is what I would recommend doing as a quick fix for your issue.

If we want to support throttling within Dispatch I think it would be with an actor as @slandelle suggests, since we want to maintain a queue of pending requests when too many are believed to be active. (Of course, then we hope the threshold is not exceeded for very long or else we run out of memory -- another limit!) This support could be built into a separate module that depends on actors. I don't think it would be that hard to build, but before I get to that I'll need to finish the 2.10 migration.
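
Roughly the kind of thing I have in mind, as a sketch only (nothing like this exists in Dispatch today, and I'm using Akka and scala.concurrent.Future purely for illustration; the names and the limit are made up):

    import akka.actor.Actor
    import scala.collection.immutable.Queue
    import scala.concurrent.Future

    object Throttler {
      // A deferred request: applying the function fires it.
      final case class Run(task: () => Future[Any])
      case object Done
    }

    // Keeps at most `limit` requests in flight; the rest wait in a queue.
    class Throttler(limit: Int) extends Actor {
      import Throttler._
      import context.dispatcher // ExecutionContext for the onComplete callback

      private var inFlight = 0
      private var pending = Queue.empty[() => Future[Any]]

      def receive = {
        case Run(task) =>
          if (inFlight < limit) start(task)
          else pending = pending.enqueue(task) // over the threshold: queue it
        case Done =>
          inFlight -= 1
          if (pending.nonEmpty) {
            val (next, rest) = pending.dequeue
            pending = rest
            start(next)
          }
      }

      private def start(task: () => Future[Any]): Unit = {
        inFlight += 1
        task().onComplete(_ => self ! Done) // free the slot on success or failure
      }
    }

A fuller version would pipe each result back to the sender, and as noted above it still has to worry about the pending queue growing without bound.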

For now I'm closing this issue as there will be no immediate work on it, but feel free to raise it on the mailing list if you want to get more feedback on how others manage the same problem.


Rogach commented on June 16, 2024

In the meantime, here's the workaround that I use:

    import java.util.concurrent.Semaphore
    import dispatch._ // for Promise

    // Run the deferred requests with at most 5 in flight at a time.
    def safePromiseAll[A](promises: Seq[() => Promise[A]]): Seq[A] = {
      val semaphore = new Semaphore(5, true)
      promises.map { pr =>
        semaphore.acquire()      // wait for a free permit before firing the request
        pr().onComplete { _ =>   // onComplete returns the promise, so this map yields Seq[Promise[A]]
          semaphore.release()    // hand the permit back on success or failure
        }
      }.map(_())                 // block until each promise yields its value
    }
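
For context, a call site might look something like this (assuming the 0.9-style API, where Http(req OK as.String) yields a dispatch.Promise[String]; the URLs are made up):

    import dispatch._

    val urls = Seq("http://example.com/a", "http://example.com/b")

    // Each element is a deferred request: nothing is sent until the function is
    // applied, which lets safePromiseAll decide how many run at once.
    val deferred: Seq[() => Promise[String]] =
      urls.map(u => () => Http(url(u) OK as.String))

    val bodies: Seq[String] = safePromiseAll(deferred)

Note that both acquire() and the final .map(_()) block the submitting thread, so this trades some asynchrony for a hard cap on concurrency.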


jackcviers commented on June 16, 2024

@n8han @Rogach high ulimits are the way to go.

I don't think request throttling is dispatch's responsibility. If a user needs to make a lot of requests (something that can inherently fail) and also needs to guarantee some kind of success, there are patterns, like error kernels or request pools, to handle that use case. I don't think mutex blocking is the solution: dependent requests that yield results, as in the examples on the reboot site, naturally limit the number of open sockets and provide a way to manage failure and retries. Throttling is the responsibility of either the user or the underlying NIO library.


Rogach commented on June 16, 2024

@jackcviers Actually, I hit the aforementioned problem when doing dependent requests - it's just that those dependent requests resulted in an exponential increase in the number of requests to be made, which led to the problem.

Consider the following case - simple web crawling, where each page that is downloaded generates more links to crawl. Given the current implementation, errors begin to happen very quickly - and since the behavior isn't documented and the API lends itself to this way of coding, it's an easy mistake to make.

Also, you're going to see problems well before the ulimit is exceeded - 100 concurrent requests already overload the interface, resulting in timeouts and other errors.

My point is, either we remove Promise.all from the library completely and state in the documentation that users should be very careful with their requests, or we implement throttling in the library.


polymorphic commented on June 16, 2024

@Rogach could you please expand on your workaround? I'm seeing similar problems in an Akka system where a set of actors makes POST requests via scalaxb (which under the covers uses dispatch). Any other insight into preventing hitting the limit? For example, a way of querying how many requests are in progress would make it possible to rate-limit somewhere upstream.


Rogach commented on June 16, 2024

@polymorphic - I don't see an easy way of querying the number of currently executing requests - you'd probably need to go all the way down to the AsyncHttpClient internals for that, and it's not going to be pretty.

As I see it, in your case (many separate actors querying the same Http instance), you basically have two options: either implement some sort of token mechanism with a separate actor, which would manage the querying and send the results back to the requesting actor, or go the semaphore route - just carry the semaphore alongside the Http instance (you can probably even create a wrapper).
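
For the semaphore route, the wrapper can be as simple as something like this (a sketch with a made-up RequestGate name, written against scala.concurrent.Future; the same shape works with dispatch's own Promise, and the limit is whatever your interface can handle):

    import java.util.concurrent.Semaphore
    import scala.concurrent.{ExecutionContext, Future}
    import scala.util.control.NonFatal

    // Carry this alongside the shared Http instance and route every request
    // through run(). acquire() blocks the submitting thread once `limit`
    // requests are already in flight.
    class RequestGate(limit: Int)(implicit ec: ExecutionContext) {
      private val slots = new Semaphore(limit, true)

      def run[A](request: () => Future[A]): Future[A] = {
        slots.acquire()                      // wait for a free slot (may block)
        try {
          val f = request()                  // actually fire the request
          f.onComplete(_ => slots.release()) // free the slot on success or failure
          f
        } catch {
          case NonFatal(e) =>                // request() itself threw before returning a future
            slots.release()
            throw e
        }
      }
    }

Each actor would then call something like gate.run(() => Http(req OK as.String)) instead of hitting the Http instance directly.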

I actually think it would be possible to build the semaphore stuff directly into dispatch, thus retaining the pretty syntax and control flow while keeping the whole interaction safe from ulimit and related exceptions - but that would introduce the possibility of blocking in the submit calls, which is not good (though it can also be worked around on the client side).


polymorphic commented on June 16, 2024

@Rogach thank you for the answer. I can't afford to go ahead without capping the ulimit; I tested and saw it go over 10K connections but that's asking for trouble :-/

I've tried capping the number of simultaneous connections by configuring the AsyncHttpClient, which provides a ThrottleRequestFilter that implements blocking along the lines of your workaround. That hasn't stopped the connection count from growing, though; I'm doing this from inside a scalaxb-generated WSDL client, so perhaps I missed something. Other suggestions welcome - I bet others have had to deal with this.
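
For reference, the wiring is roughly the following (a sketch assuming AsyncHttpClient 1.x, where ThrottleRequestFilter lives in com.ning.http.client.extra, and Dispatch's Http.configure; the limit of 100 is arbitrary):

    import dispatch._
    import com.ning.http.client.extra.ThrottleRequestFilter

    // Cap in-flight requests at the client level: the filter blocks further
    // request submission once 100 requests are already outstanding.
    val throttledHttp = Http.configure(
      _.addRequestFilter(new ThrottleRequestFilter(100))
    )

The filter only applies to requests that go through this particular instance, so the scalaxb-generated client has to be pointed at it; that may be the part I'm missing.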


oliviertoupin commented on June 16, 2024

I get the "too many open files" errors using only 4 connections (4 hosts), by calling the resource that spawns those 4 connections several times. I run it at most 10 times before it breaks, yet I end up with 10000 open files. That's 10000 files for 40 connections; the numbers don't add up.

Using lsof, I can see ~6500 pipes, ~3200 anon_inodes, and only 400 sockets (and most of those were there before the dispatch call).

So, as you can see, there might still be an issue: there are way too many open files for the number of requests.

I'm using dispatch 0.11.

