Comments (2)

ryandotsmith commented on August 26, 2024

Hello, @eprothro
Sorry for the delayed response. This week was incredibly busy for me. Anyway, you raise some really interesting questions. Let's try to tease them apart to gain a better understanding of what is going on.

First of all, while I greatly value theoretical optimizations, they are no replacement for experimental evidence. Have you tried benchmarking a Unicorn/NGINX/Heroku stack with various permutations of the backlogs? If so, care to share your results? It has been a while since I did this type of experimentation, and the last time I did, I neglected to publish my results. Perhaps we can collaborate on such a project.

Secondly, what is the problem that you are trying to solve? Do you have latency problems? Are you trying to trim down your 99th percentile latency? Do you have connection timeout issues? Is the Heroku router returning H12s?

Finally, let me take a stab at addressing some of your direct questions:

Does this mean, in the case of this buildpack's configuration, that the NGINX 'backlog depth' = max number of clients is 1024 = 4 × (1024 / 4), or does this math change given the router layer's handling of connections to and from the client?

Correct. Since NGINX is running as a reverse proxy, with worker_connections set to 1024 and 4 worker processes, the maximum number of clients we can handle is 1024.
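For reference, here is a minimal sketch of the directives behind that math; the values mirror the numbers discussed above and are not copied verbatim from the buildpack's nginx.conf.erb:

```nginx
# Illustrative excerpt, not the buildpack's actual config.
worker_processes 4;            # 4 workers, as in this discussion

events {
  # Each worker can handle up to 1024 connections. The common
  # reverse-proxy rule of thumb divides by 4 per end user
  # (client-side plus upstream connections), giving
  # 4 * (1024 / 4) = 1024 concurrent clients.
  worker_connections 1024;
}
```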

Do connections for which NGINX is buffering the request body or responses count against this worker_connections?

Absolutely. This is one of the great advantages of using NGINX in the way we have it configured. If you are working with slow clients, NGINX will use one of its worker connections to handle the byte transfer, and that transfer happens before the request occupies a slot in the Unicorn backlog (i.e. the listen backlog on the socket).
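To make that concrete, these are the standard NGINX directives that control the buffering being described; the values and the upstream name are illustrative assumptions, not the buildpack's defaults:

```nginx
# Illustrative settings; not taken from the buildpack's nginx.conf.erb.
location / {
  proxy_pass http://app_server;   # assumed upstream pointing at Unicorn

  # Hold the request body from a slow client in memory (spilling to a
  # temp file if larger) before the request is proxied upstream.
  client_body_buffer_size 16k;

  # Buffer the upstream response so Unicorn can finish writing and free
  # a worker even if the client reads slowly.
  proxy_buffering on;
  proxy_buffers 8 16k;
}
```

The upshot is that a slow transfer ties up one of the worker connections counted above rather than a Unicorn worker or a backlog slot.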

What happens when NGINX max clients is hit? (e.g. is the next request gracefully redistributed to another dyno?)

From the docs, it appears that an HTTP 503 response is issued. I would like to test this locally to make sure.

from nginx-buildpack.

eprothro commented on August 26, 2024

Hi @ryandotsmith, thanks very much for your response. My sincere apologies for just now getting back to you. I had to revert to not using NGINX and move on from this problem, and then I also just recently had a new kiddo.

However, I at least owe you a response:

To answer your question directly: no, I didn't have time to benchmark tweaking the backlog with NGINX in the stack. I had already spent a few days (that I didn't have to begin with) benchmarking potential correlations between client bandwidth and request queueing.

I'd be more than happy to collaborate with you as time allows, but I also know that we're both very busy ;).

Ultimately, the problem I'm trying to solve is 'zombie' dynos (100% of requests sent to a dyno result in an H12 for anywhere from a few seconds to a few minutes). I've inherited a couple different poorly performing applications over the years that suffered from this problem. Recently, for a slew of reasons, I started to wonder if client bandwidth could be a factor.

I hesitate to frame this as the problem I'm trying to solve, because the implications, and the discussion/troubleshooting around those implications, are so application-specific and way outside the scope of this issue. Moreover, the 'correct' answer here is 'make the app's slow actions faster'; however, accomplishing that in a timely manner isn't necessarily possible, for a lot of reasons not worth going into.

However, I don't feel limiting the backlog is only a theoretical optimization, given Heroku's random routing algorithm and my experience. With either of the above applications, when I limit the backlog depth (using Unicorn's configuration, as sketched below), I've seen experimentally, with real traffic, that the frequency and impact (number of timeouts before the H12s stop) of these episodes are both reduced with a very small backlog (like 2N, where N = Unicorn concurrency).
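For concreteness, the knob I'm talking about is the :backlog option on Unicorn's listen directive; the file name and numbers below are just an illustration of my setup, not a recommendation:

```ruby
# config/unicorn.rb (illustrative)
worker_processes 4                          # N = Unicorn concurrency

# Keep the listen backlog small (~2N) so that when every worker is busy
# and the queue is full, new connections are refused quickly and the
# Heroku router can try another dyno, instead of letting requests sit
# here until they time out as H12s.
listen ENV.fetch("PORT", "3000").to_i, backlog: 8
```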

This makes perfect sense to me statistically, and I'd be happy to discuss it if desired.


However, it sounds like trying to achieve this (configuring the stack to use backlog depth to achieve load balancing) with NGINX in the stack would be a problem: when the maximum is reached, NGINX responds to the client with a 503 instead of refusing the connection from the router mesh, so the router never gets the chance to retry the request on another dyno.

Hope this helps with some context.

from nginx-buildpack.
