Comments (7)
- Hard code a smaller buffer size
  Seems rather sad.
- Dynamically shrink the buffer size if we determine it is not working
  This is the approach I've gone for, primarily because it should be able to stop other pathological behaviour.
- Read the buffer backwards from our seek point if we detect we are seeking backwards
  This might be a nice option to get syncing going still faster, but it's also fiddly and only solves this exact problem. I'll settle for having it no worse than it was in 3.4.4.
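To illustrate the "dynamically shrink" option, here is a minimal toy model in Python. It is not the code in file_handle_cache.erl; the class name, the grow/shrink rule and the 4096-byte floor are all assumptions made for illustration. The idea sketched is: if a refill was only partly consumed, shrink the buffer to what was used; if it was fully consumed, double it up to a maximum.

```python
class AdaptiveReadBuffer:
    """Toy model of a read buffer that adapts its size to actual usage.

    Hypothetical illustration only; not the RabbitMQ implementation.
    """

    def __init__(self, max_size=1_048_576, min_size=4_096):
        self.max_size = max_size
        self.min_size = min_size  # assumed floor, chosen arbitrarily
        self.size = max_size      # start at the full 1 MB buffer

    def record_read(self, bytes_used):
        """Update the size after a refill of which `bytes_used` bytes were consumed."""
        if bytes_used >= self.size:
            # The whole buffer was useful: try a bigger one next time.
            self.size = min(self.size * 2, self.max_size)
        else:
            # Part of the buffer was wasted: shrink to what was needed.
            self.size = max(bytes_used, self.min_size)
        return self.size


buf = AdaptiveReadBuffer()
sizes = [buf.record_read(10468) for _ in range(4)]
print(sizes)
```

With a rule like this and reads that each consume 10468 bytes, the size alternates between 10468 and 20936, because every fully used buffer triggers a doubling that is then undone by the next read.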
from rabbitmq-server.
Just for clarity, this bug:
- Only affects 3.5.0
- Requires messages to be larger than the queue index embedding threshold (by default 4kB)
- Requires messages to be paged out before synchronisation starts
You can see in the I/O stats on the master that if (say) 250 messages are read from disk per second, we also read 250 MB/s, even if the messages are much smaller than 1 MB each.
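The arithmetic behind that observation can be sketched as follows, assuming each message read refills a full 1 MB buffer (the buffer size reported for stable later in this thread) and taking 10 kB as a hypothetical message size. The function name is made up for this example.

```python
def read_amplification(reads_per_s, buffer_bytes, message_bytes):
    """Return (bytes read from disk per second, useful bytes per second, ratio)."""
    disk = reads_per_s * buffer_bytes
    useful = reads_per_s * message_bytes
    return disk, useful, disk / useful


# 250 reads/s, 1 MB buffer (decimal), 10 kB messages (assumed size)
disk, useful, ratio = read_amplification(250, 1_000_000, 10_000)
print(disk / 1e6, "MB/s read from disk")
print(useful / 1e6, "MB/s of message payload")
print(ratio, "x amplification")
```

Under these assumptions the disk traffic depends only on the read rate and the buffer size, not on the message size, which is why the MB/s figure tracks the messages/s figure one to one.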
from rabbitmq-server.
Here is what I did to test the correction:
- I started two nodes, A and B, with a very low vm_memory_high_watermark to make them page messages out early, clustered them and added the following HA policy on node B:
  rabbitmqctl -n B set_policy ha-all "." '{"ha-mode":"all"}'
- I stopped node B using:
  rabbitmqctl -n B stop_app
- I used PerfTest to publish 10 kB messages with a rate-limited consumer so messages stay in RabbitMQ:
  PerfTest -s 10240 -R 100
- The producer could publish around 40,000 messages before being throttled.
- I started node B again and forced synchronisation from the management UI.
With the stable branch, the management UI reports I/O read rates of:
- 150 messages/s
- 150 MB/s
With the rabbitmq-server-69 branch (this fix), it reports:
- 1000 messages/s
- 15 MB/s
I logged the size of the read buffer in file_handle_cache.erl at the same time. With stable, the buffer remains at an expected 1 MB size. With the fix, the size continuously switches between 10468 and 20936, with an occasional jump to 4 MB.
from rabbitmq-server.
Note that you don't need the -R 100; you can use -y0 -u test -p to get PerfTest to publish to a queue with no consumers, which might be easier to work with.
The 4MB sizes probably refer to other files (queue index files?)
Not sure whether the flicking between 10468 and 20936 is worth fixing, what do you think?
from rabbitmq-server.
Oh, also you can set a very low vm_memory_high_watermark_paging_ratio rather than vm_memory_high_watermark; that way you can publish indefinitely but get paged out rapidly.
from rabbitmq-server.
One correction to my previous comment: "150 messages/s" and "1000 messages/s" should read:
- 150 reads/s
- 1000 reads/s
> The 4MB sizes probably refer to other files (queue index files?)
You're right, the file handle differs for those reads.
> Not sure whether the flicking between 10468 and 20936 is worth fixing, what do you think?
After a test:
- With the flickering buffer:
  - 1000 reads/s
  - 15 MB/s from disc
  - 10-12 MB/s sent to node B
- With a constant buffer:
  - 750 reads/s
  - 7 MB/s from disc
  - 7 MB/s sent to node B
In the first case, we read 20 kB but use only 10 kB, then we read 10 kB, then we double the buffer again, and so on. In the second case we always read 10 kB. Comparing the number of reads to the throughput, the throughput per read is about 33% lower in the second case, which corresponds to not wasting 10 kB out of every 30 kB read. However, I can't explain why the constant buffer is slower overall...
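The 33% figure can be checked with a short calculation. This is only a sketch: it assumes every read serves exactly one 10468-byte message (the smaller buffer size logged earlier), and the helper name is made up for this example.

```python
MESSAGE = 10468  # bytes; assumed to equal the smaller logged buffer size


def summarize(buffer_sizes, message=MESSAGE):
    """Average bytes per read and wasted fraction when each read serves one message."""
    total_read = sum(buffer_sizes)
    useful = message * len(buffer_sizes)
    return {
        "bytes_per_read": total_read / len(buffer_sizes),
        "wasted_fraction": (total_read - useful) / total_read,
    }


flicker = summarize([20936, 10468])  # buffer alternates between the two sizes
constant = summarize([10468])        # buffer always matches the message size

print(flicker)
print(constant)
print(1000 * flicker["bytes_per_read"] / 1e6, "MB/s from disc at 1000 reads/s")
print(1000 * MESSAGE / 1e6, "MB/s of payload at 1000 reads/s")
print(750 * constant["bytes_per_read"] / 1e6, "MB/s from disc at 750 reads/s")
```

Under that assumption the flickering buffer averages 15702 bytes per read with one third of them wasted, which lines up with the reported 15 MB/s from disc and 10-12 MB/s sent to node B. It does not explain the lower read rate with the constant buffer.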
from rabbitmq-server.
Here are new, more meaningful numbers comparing stable and 8faf4ee.
The protocol is:
- Start nodes A and B, cluster them, add an HA policy.
- Create a queue from the management UI.
- Stop node B.
- Use PerfTest to queue 300,000 messages, which is enough to page them out (the filesystem is tmpfs). No clients are connected after that.
- Start B and force synchronization. While this happens, look at the time the full sync takes, as well as I/O and network statistics.
Results with stable:
- Synchronization finished in 1'55".
- Reads: 1600/s (1.6 GiB/s)
- Network (from A to B): 18 MiB/s while messages are paged in, then 58 MiB/s
Results with 8faf4ee:
- Synchronization finished in 1'10".
- Reads: 4500/s (57 MiB/s)
- Network (from A to B): 58 MiB/s
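For a quick comparison of the two runs, the figures above can be reduced to a speedup and an average number of bytes per read. This sketch uses only the numbers quoted in the results; the variable names are made up for this example.

```python
MiB = 1024 ** 2
GiB = 1024 ** 3

# Sync durations: 1'55" on stable, 1'10" on 8faf4ee
stable_secs = 1 * 60 + 55
fixed_secs = 1 * 60 + 10
speedup = stable_secs / fixed_secs

# Average bytes fetched per read, from the reported read rates
stable_bytes_per_read = 1.6 * GiB / 1600
fixed_bytes_per_read = 57 * MiB / 4500

print(f"sync speedup: {speedup:.2f}x")
print(f"stable:  {stable_bytes_per_read / MiB:.3f} MiB per read")
print(f"8faf4ee: {fixed_bytes_per_read / 1024:.1f} KiB per read")
```

On stable each read fetches about 1 MiB, consistent with the fixed buffer size noted earlier, while on 8faf4ee each read fetches roughly 13 KiB and the disc read rate roughly matches the 58 MiB/s sent over the network.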
from rabbitmq-server.