edenhill avatar edenhill commented on June 30, 2024

Thank you for reporting this, it should now be fixed in master.
If you have configured a delivery report callback (dr_cb), it will be called with err set to RD_KAFKA_RESP_ERR_MSG_SIZE_TOO_LARGE.

Could you please update your librdkafka and verify this fix?

from librdkafka.

winbatch avatar winbatch commented on June 30, 2024

It's working in a standalone program. However, it's not working when I run it through my 'real' program. (The one that statically links with librdkafka.a). Not sure why yet. In that case, it just hangs and an starve complains about a futex timing out?

edenhill avatar edenhill commented on June 30, 2024

Could you try running it in gdb when it hangs and provide the output of:

thread apply all bt

You can mask out your program's traces

winbatch avatar winbatch commented on June 30, 2024

All say poll() except one that says pthread_cond_timedwait

edenhill avatar edenhill commented on June 30, 2024

Okay, the "futex timing" complaint, what prints that?

winbatch avatar winbatch commented on June 30, 2024

strace (sorry for the typo before, I'm writing via mobile phone)

winbatch avatar winbatch commented on June 30, 2024

Actually, I'm able to reproduce it now with the standalone program as well. If the max message size on the broker is 4000000 and if I create a message that's 4000000, it fails (as designed). If I try 4000001, it hangs. And actually, even if the broker max is 1000000, looks like 4000001 will also make it hang. So maybe it just can't handle > 4000000 at all...

I'm also not sure on the math, because in my standalone program, I'm not using a key, but yet it will complain about 3999999 as being too large.

edenhill avatar edenhill commented on June 30, 2024

The message size restriction on the broker might include the header size, which adds a few more bytes.

Can you reproduce this with rdkafka_performance?

rdkafka_performance -P -t <topic> -s <size> -c 1 -b <brokeraddr>

winbatch avatar winbatch commented on June 30, 2024

I can't reproduce the exact issue with rdkafka_performance, but I am definitely able to reproduce strange behavior, which I'll detail below. I think one difference may be that I'm using 'COPY' and you are using 'FREE' on the produce. But here's the weird behavior with the performance program, which may highlight to you an underlying problem that affects both. The max msg size on the broker in question is 1000000. You will see that numbers larger than 4000000 (so 4000001 and 5000000) both show successful when they should not... There is something about > 4000000 that behaves funny.

./rdkafka_performance -b kafkadevcluster1-1.aim.services.masked.com:5757,kafkadevcluster1-2.aim757,kafkadevcluster1-3.aim.services.masked.com:5757 -t LaraReplicator_kafkacluster3 -P -c 1 -s 999999 | tail -1
% 1 messages and 999999 bytes produced in 56ms: 17 msgs/s and 17.70 Mb/s, 1 messages failed, no compression
./rdkafka_performance -b kafkadevcluster1-1.aim.services.masked.com:5757,kafkadevcluster1-2.aim757,kafkadevcluster1-3.aim.services.masked.com:5757 -t LaraReplicator_kafkacluster3 -P -c 1 -s 1000000 | tail -1
% 1 messages and 1000000 bytes produced in 60ms: 16 msgs/s and 16.56 Mb/s, 1 messages failed, no compression
./rdkafka_performance -b kafkadevcluster1-1.aim.services.masked.com:5757,kafkadevcluster1-2.aim757,kafkadevcluster1-3.aim.services.masked.com:5757 -t LaraReplicator_kafkacluster3 -P -c 1 -s 4000000 | tail -1
% 1 messages and 4000000 bytes produced in 91ms: 10 msgs/s and 43.50 Mb/s, 1 messages failed, no compression
./rdkafka_performance -b kafkadevcluster1-1.aim.services.masked.com:5757,kafkadevcluster1-2.aim757,kafkadevcluster1-3.aim.services.masked.com:5757 -t LaraReplicator_kafkacluster3 -P -c 1 -s 4000001 | tail -1
% 1 messages and 4000001 bytes produced in 0ms: 1517 msgs/s and 6069.80 Mb/s, 0 messages failed, no compression
./rdkafka_performance -b kafkadevcluster1-1.aim.services.masked.com:5757,kafkadevcluster1-2.aim757,kafkadevcluster1-3.aim.services.masked.com:5757 -t LaraReplicator_kafkacluster3 -P -c 1 -s 800000 | tail -1
% 1 messages and 800000 bytes produced in 51ms: 19 msgs/s and 15.57 Mb/s, 0 messages failed, no compression
./rdkafka_performance -b kafkadevcluster1-1.aim.services.masked.com:5757,kafkadevcluster1-2.aim757,kafkadevcluster1-3.aim.services.masked.com:5757 -t LaraReplicator_kafkacluster3 -P -c 1 -s 5000000 | tail -1
% 1 messages and 5000000 bytes produced in 0ms: 5235 msgs/s and 26178 Mb/s, 0 messages failed, no compression

edenhill avatar edenhill commented on June 30, 2024

This is a problem with rdkafka_performance not counting produce() errors as failed messages:

% Sending 1 messages of size 9000000 bytes
produce error: Message too long
% 1 messages and 9000000 bytes produced in 0ms: 47619 msgs/s and 428571.44 Mb/s, 0 messages failed, no compression

The produce() call fails (returns -1) when the message size is larger than the LOCALLY configured message.max.bytes value (which defaults to 4000000).

This does not indicate an error on the librdkafka side, though.

winbatch avatar winbatch commented on June 30, 2024

ok, but the message delivery callback doesn't get called.... I would have expected it to be called with an error code.

edenhill avatar edenhill commented on June 30, 2024

It will fail directly for messages that stand no chance of delivery, see here:

https://github.com/edenhill/librdkafka/blob/master/rdkafka.h#L649

* Returns 0 on success or -1 on error in which case errno is set accordingly:
*   ENOBUFS  - maximum number of outstanding messages has been reached:
*              "queue.buffering.max.messages"
*   EMSGSIZE - message is larger than configured max size:
*              "message.max.bytes".
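
To make the convention quoted above concrete, here is a minimal sketch of a producer-style function that fails immediately with -1 and errno. It is not librdkafka code: `demo_limits` and `demo_produce` are hypothetical stand-ins, and the exact comparison against the size limit is an assumption based on the behaviour reported in this thread (4000000 is accepted locally, 4000001 is not).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a producer's local limits (not a librdkafka type). */
struct demo_limits {
    size_t max_msg_bytes; /* plays the role of message.max.bytes */
    size_t max_queued;    /* plays the role of queue.buffering.max.messages */
    size_t queued;        /* messages currently waiting for delivery */
};

/* Returns 0 on success, or -1 with errno set, mirroring the quoted comment.
 * Assumption: a message exactly at the limit is accepted locally. */
static int demo_produce(struct demo_limits *l, size_t len) {
    assert(l != NULL);
    if (len > l->max_msg_bytes) {
        errno = EMSGSIZE; /* message can never be delivered: fail right away */
        return -1;
    }
    if (l->queued >= l->max_queued) {
        errno = ENOBUFS;  /* queue full: caller may apply backpressure and retry */
        return -1;
    }
    l->queued++;
    return 0;
}
```

A caller would check the return value first and only then inspect errno, since errno is meaningful only after a -1 return.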

winbatch avatar winbatch commented on June 30, 2024

ok for now, though feels a bit inconsistent from a user perspective. (some cases via callback and some right after calling).

Should I be calling rd_kafka_err2str on the errno? It doesn't seem to be able to translate it. In this case the errno is set to 90.

edenhill avatar edenhill commented on June 30, 2024

It might seem inconsistent to have two different error reporting facilities, but it allows the application to take actions immediately:
I.e., for ENOBUFS the application can propagate backpressure or spool the message to disk.

errno holds a standard system error code, so use strerror() to translate it.

One could argue that produce() should return the rd_kafka_resp_err_t codes, that would be more consistent, but it would break existing applications at this point.
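
As a small illustration of the strerror() advice, the sketch below formats an errno value from a failed produce() call into a readable message. The helper name `describe_produce_errno` is hypothetical. On Linux, EMSGSIZE is 90, which matches the errno reported above, and glibc renders it as "Message too long".

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn the errno left by a failed produce() call
 * into a human-readable message using the standard strerror(). */
static void describe_produce_errno(int err, char *buf, size_t buflen) {
    assert(buf != NULL && buflen > 0);
    snprintf(buf, buflen, "produce failed: %s (errno %d)", strerror(err), err);
}
```

Save errno into a local variable right after the failing call and pass that copy in, because later library calls may overwrite errno.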

winbatch avatar winbatch commented on June 30, 2024

As long as I can reliably detect errors (which I'll do using both methods), I'm good. Thanks as always for the fast and useful responses.

I guess you'll open a ticket for the minor tweak to the performance tester for error counting?

winbatch avatar winbatch commented on June 30, 2024

Saw that this issue was closed, but wanted to check whether you plan to enhance the performance binary to properly indicate errors that may have occurred.

'This is a problem with rdkafka_performance not counting produce() errors as failed messages:

% Sending 1 messages of size 9000000 bytes
produce error: Message too long
% 1 messages and 9000000 bytes produced in 0ms: 47619 msgs/s and 428571.44 Mb/s, 0 messages failed, no compression'

edenhill avatar edenhill commented on June 30, 2024

This was fixed in:
commit d25d782
Date: Wed Jan 1 23:26:07 2014 +0100

rdkafka_performance: Count immediate produce() failures as failed messages too
$ ./rdkafka_performance -P -t onepart -p 1 -c 1 -s 1241241111
% Using random seed 1391083001
% Sending 1 messages of size 1241241111 bytes
produce error: Message too long
% 1 messages and 1241241111 bytes produced in 0ms: 27027 msgs/s and 33547056.00 Mb/s, 1 messages failed, no compression
All messages produced, now waiting for 1 deliveries
% 0 messages in outq
% 1 backpressures for 1 produce calls: 100.000% backpressure rate
% 1 messages and 1241241111 bytes produced in 0ms: 9708 msgs/s and 12050884.00 Mb/s, **1 messages failed**, no compression
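
The idea behind the commit above can be sketched as follows: a send loop that counts an immediate -1 from produce() as a failed message, since no delivery report will arrive for it. This is not the actual rdkafka_performance code; `produce_fn`, `send_stats`, `send_all`, and `demo_produce_limit` are hypothetical names, and the 4000000 limit is simply the default mentioned earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical produce signature: 0 on success, -1 on immediate failure. */
typedef int (*produce_fn)(size_t len);

struct send_stats {
    size_t attempted; /* produce calls made */
    size_t failed;    /* calls that failed immediately */
};

/* Stand-in producer that rejects anything above an assumed local limit. */
static int demo_produce_limit(size_t len) {
    return len > 4000000 ? -1 : 0;
}

/* Count immediate failures so the final summary does not report them as sent. */
static struct send_stats send_all(produce_fn produce, const size_t *sizes, size_t n) {
    struct send_stats st = {0, 0};
    assert(produce != NULL);
    for (size_t i = 0; i < n; i++) {
        st.attempted++;
        if (produce(sizes[i]) == -1)
            st.failed++; /* no delivery report will follow for this message */
    }
    return st;
}
```

An application that also uses a delivery report callback would add the callback-reported failures to the same counter to get a complete picture.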

winbatch avatar winbatch commented on June 30, 2024

ok cool. serves me right for not having tested it again before asking ;)

edenhill avatar edenhill commented on June 30, 2024

Well, I had to double check as well ;)
