Comments (11)
I can prove this with an Etch-A-Sketch (so to speak): https://github.com/gunnihinn/queuesim
from carbonapi.
@gunnihinn I would be interested in seeing a ring-buffer added to the model.
Ring-buffers are sometimes found in soft-realtime applications like audio streams because during periods of overload, the dropped data is experienced as a "hickup" in the stream, but the content continues to (largely) be sensible. By contrast, dropping data non-contiguously is more likely to create nonsense and noise. Perhaps using a ring-buffer would be a sensible approach to shedding in this case?
(Another aspect which can be an advantage is that the buffer is allocated at the start and remains constant size for the duration of execution, however I guess that is probably less important here.)
from carbonapi.
You actually don't want a FIFO queue.
Under normal, happy conditions it doesn't matter what type of queue you have because almost everything gets served.
When you're under heavy load and need to start shedding some to survive, using a FIFO queue can cause you to time out almost all the requests sitting in your queue if you pick a few heavy requests from it in a row. If you instead pick requests randomly, or using a FILO queue, you increase the expected number of requests you can serve even under that load.
IIRC, the first Google SRE book has some discussion of this in one of the chapters on load shedding or cascading failures. Andre Vereira at Booking also knows some stuff about this if you want a second face-to-face opinion.
from carbonapi.
Here's an illustration of what happens https://play.golang.org/p/V4riLAtcUAq
from carbonapi.
If you're looking to optimise throughput in scenarios with many concurrent requests, there're some systems out there that adopted LIFO queues. Could be interesting to see if this strategy can be useful here.
from carbonapi.
@ksurent Thanks for the suggestions. We will be able to compare after we do the refactoring.
I think, in this case, we need something simple. Just a classic limited-length requests queue instead of a semaphore. Or something like that.
from carbonapi.
Sure, I was just suggesting a different queueing discipline (LIFO vs FIFO).
from carbonapi.
Thanks for all the insight you shared. This is a much more interesting problem than it seems at first glance. The suggestion to go with FIFO looked to be the simplest approach but probably is not the best one.
I will invest time to study the topic in depth and go through all the references (and the model) you gave.
from carbonapi.
I can prove this with an Etch-A-Sketch (so to speak): gunnihinn/queuesim
This model does not include the case described in the first post. Also, it operates only on bounded queues, while our current "queue" is not bounded (I am not claiming it is a good thing though). The queue limiting happens by timeouts application, thus yielding a CoDel queue.
Correct me if I am getting something wrong.
While going through various materials, I agree that FIFO most likely is not the best solution. Currently, @ericherman's proposal of ring-buffer looks best to me.
from carbonapi.
This model does not include the case described in the first post.
Indeed; I only wanted to point out that FIFO queues are maybe not the best option, and that analyzing the different options is a somewhat nontrivial task. It's your show, go with whatever performs best in whatever tests you setup.
Currently, @ericherman's proposal of ring-buffer looks best to me.
Well... is it better than the one you already have?
from carbonapi.
Well... is it better than the one you already have?
Currently we make a goroutine for each request. These are waiting in "queue" for the semaphore. The problem is that there can be very many of goroutines in queue (up to 300k
in one of our setups). This takes a lot of memory (8kB x 300k = 2.4GB
, 8kB
being initial goroutine stack size) and slows down GCs up to 1 second
. This also creates contention between goroutines that slows everything down.
We need to move queuing from goroutines to some data structure. It is the discussion here about which one. Also, we need to make it bounded. Currently, it's not. Because of this, we sometimes get explosions in the number of goroutines.
It is clear that FIFO is not a good bounded structure (thanks for the model!). LIFO is not bounded, neither is a pool with random selection. We would need to make some bounding strategy on top (as it is described in the resources shared above). Compared to these, ring buffer seems to be the simplest solution. It is also naturally resilient.
But this is just a guess, we will see when we get to test these. Just explaining my thinking.
from carbonapi.
Related Issues (20)
- Fix errorlint violations and re-enable it
- Compatibility with graphite-web 1.1.x? HOT 1
- Fix "Improper Input Validation in GoGo Protobuf" security issue
- Stop using zapwriter library
- CI actions succeed even though they contain errors. Unit tests fail. HOT 1
- Faulty floats comparison leads to tests intermittently failing on some machines. Rewrite floats comparison.
- CI fails during linting because it runs out of memory. HOT 1
- Redesign the request processing pipeline.
- Upgrade to go1.19.2
- observability: Introduce basic tracing with logs.
- Improve replicated memcached availability and latency.
- Merge carbonapi and zipper binaries into a single one.
- Refactor and optimize processing after zipper merge.
- Merge zipper config into the main config.
- Merge zipper metrics into the main metrics set.
- Move all config documentation to the code.
- Eliminate global state represented by the App in carbonapi and zipper.
- Add metrics covering the whole path of a request during processing. HOT 1
- Full-spectrum integration testing system. HOT 1
- security issues curl and libcurl packages
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from carbonapi.