Code Monkey home page Code Monkey logo

Comments (14)

jmikola avatar jmikola commented on September 17, 2024 2

@aleshaczech: This is caused by libmongoc including a saslSupportedMechs field in outgoing monitoring commands when it should only do so for the initial handshake. I've written up a detailed report of the issue in CDRIVER-4812, so please watch that for further updates.

This can likely be addressed for the upcoming libmongoc 1.26 release, which would then be included in the next PHPC minor version that follows.

In the meantime, you should be able to work around this by explicitly specifying authMechanism=SCRAM-SHA-256 in your connection string (or the URI options array). See MongoDB\Driver\Manager::__construct() and/or Authentication Options: authMechanism in the MongoDB manual for more information. By explicitly defining an auth mechanism, you'll bypass the driver's default behavior to attempt negotiation and the field will never be appended to the handshake or subsequent monitoring commands (even though it's only the latter that causes this issue).

from mongo-php-driver.

AlesKrajnik avatar AlesKrajnik commented on September 17, 2024 1

Thank you, @jmikola !

This indeed seems to fix the issue, my logs are now a very quiet place. :-)

I guess we can close this ticket, since the issue is now tracked in Jira. If that's not right, please feel free to reopen and/or let me know.

Thanks again!

from mongo-php-driver.

jmikola avatar jmikola commented on September 17, 2024

I traced the introduction of AuthenticationAbandoned(337) back to mongodb/mongo@5b83d4c and SERVER-52863.

Some parts of the application run as long-running PHP processes and use MongoDB's db.collection.watch(), so, they are connected to MongoDB indefinitely. The errors are being logged for a few different connections. I am not sure, though, whether these errors are produced only by the long-running processes or if it's also the case of the HTTP requests-related connections.

If you'd like to further identify the connections, you could look into using the appName URI option in either in the connection string or $uriOptions parameter to the Manager or Client constructor. The option is discussed in more detail in the MongoDB\Driver\Manager::__construct() docs and should allow you append an identifier to server-side log messages for connections associated with that URI. In this case, you could use appName to differentiate the long-running PHP processes from the web app, as well as any connections from Node.js.

I'm not familiar with istio, but as a proxy is it possible that the driver-to-proxy connection is terminating while the proxy-to-server connection is not? In that case, we might see the driver reestablishing a connection and sending the initial auth handshake (as it should); however, the proxy only ends up passing the auth handshake over to the server, where it's unexpected due to there already being an existing auth session.

I'd also like to confirm that there's nothing in istio that would result in multiple driver connections being multiplexed across a single server connection. In other words, can we safely assume that "conn704" in the logs above corresponds to exactly one instance of the PHP driver? If the logs are split across several minutes, I think the likely suspect would be whatever process is watching a change stream (e.g. MongoDB\Collection::watch()).

If you're willing to debug that process further, you could register an SDAM logger within the process and have that log output (to a stream, file, or logging service, whichever you prefer). Using the sdam_logger.php example script as a base, you can build upon the subscriber class and replace the printf() calls to log to your desired output. Since authentication is performed as part of connection initialization, I would expect the serverOpening events to be of particular interest -- but it'd be worth logging all events for review. The log lines in the example script are mostly one-liners, but if you'd like to add additional information from these events you can explore the API reference for the event classes in the extension documentation.

from mongo-php-driver.

AlesKrajnik avatar AlesKrajnik commented on September 17, 2024

Hi @jmikola,

thank you for your quick and helpful advice, it helped me a lot to progress on the issue!

I was able to tag all the connections with the appName option and made the following new observations:

  • All connections from my application are affected, both from the long-running scripts doing MongoDB\Collection::watch(), as well as connections from the regular web traffic.
  • The aforementioned IP address 127.0.0.6 seems to be OK, at least the connections from the Node.js application report the same IP address and it seems to work well. I might explore that later a bit further, but I don't believe that's related to the reported issue.
  • When I shutdown all PHP processes (the long running scripts and PHP-FPM process), MongoDB immediatelly closes all the connections. This rules out istio as the root cause, it's apparently the PHP processes who keep the connections open.

I then deployed a new pod with the same PHP version and configuration (e.g. the same MongoDB library version, using the same MongoDB servers, the same credentials and the same database) and tested with the following script:

<?php

require_once(__DIR__ . '/vendor/autoload.php');

$connURI = getenv('APP_MONGO_CONN_URI');

$manager = new MongoDB\Driver\Manager($connURI);
$cmd = new MongoDB\Driver\Command(['ping' => true]);
$res = $manager->executeCommand(getenv('APP_MONGO_DB'), $cmd);

// I tried the same also with the MongoDB library
// $client = new MongoDB\Client($connURI);
// $client->selectDatabase(getenv('APP_MONGO_DB'))->command(['ping' => true]);

echo "OK", PHP_EOL;

In this case, all went well, here are the logs from both MongoDB servers.

The first server, acting as SECONDARY:

{"t":{"$date":"2024-01-05T15:00:54.011+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"127.0.0.6:44385","uuid":{"uuid":{"$uuid":"992aefdc-2e66-42d3-b517-f8066cf4a185"}},"connectionId":1078,"connectionCount":18}}
{"t":{"$date":"2024-01-05T15:00:54.014+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn1078","msg":"client metadata","attr":{"remote":"127.0.0.6:44385","client":"conn1078","doc":{"application":{"name":"test"},"driver":{"name":"mongoc / ext-mongodb:PHP ","version":"1.25.2 / 1.17.2 "},"os":{"type":"Linux","name":"Alpine Linux","version":"3.19.0","architecture":"x86_64"},"platform":"PHP 8.3.0 cfg=0x03d21ea8f9 posix=200809 stdc=201710 CC=GCC 13.2.1 20231014 CFLAGS=\"\" LDFLAGS=\"\""}}}
{"t":{"$date":"2024-01-05T15:00:54.014+00:00"},"s":"I",  "c":"ACCESS",   "id":6788604, "ctx":"conn1078","msg":"Auth metrics report","attr":{"metric":"acquireUser","micros":0}}
{"t":{"$date":"2024-01-05T15:00:54.086+00:00"},"s":"I",  "c":"NETWORK",  "id":22944,   "ctx":"conn1078","msg":"Connection ended","attr":{"remote":"127.0.0.6:44385","uuid":{"uuid":{"$uuid":"992aefdc-2e66-42d3-b517-f8066cf4a185"}},"connectionId":1078,"connectionCount":17}}
{"t":{"$date":"2024-01-05T15:00:54.086+00:00"},"s":"I",  "c":"ACCESS",   "id":5286307, "ctx":"conn1078","msg":"Failed to authenticate","attr":{"client":"127.0.0.6:44385","isSpeculative":true,"isClusterMember":false,"mechanism":"SCRAM-SHA-256","user":"user","db":"db","error":"AuthenticationAbandoned: Authentication session abandoned, client has likely disconnected","result":337,"metrics":{"conversation_duration":{"micros":71739,"summary":{"0":{"step":1,"step_total":2,"duration_micros":127}}}},"extraInfo":{}}}

The second server, acting as PRIMARY:

{"t":{"$date":"2024-01-05T15:00:54.016+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"127.0.0.6:35453","uuid":{"uuid":{"$uuid":"f2f20178-a308-4123-8f01-1c1ca85c5056"}},"connectionId":1538,"connectionCount":29}}
{"t":{"$date":"2024-01-05T15:00:54.021+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn1538","msg":"client metadata","attr":{"remote":"127.0.0.6:35453","client":"conn1538","doc":{"application":{"name":"test"},"driver":{"name":"mongoc / ext-mongodb:PHP ","version":"1.25.2 / 1.17.2 "},"os":{"type":"Linux","name":"Alpine Linux","version":"3.19.0","architecture":"x86_64"},"platform":"PHP 8.3.0 cfg=0x03d21ea8f9 posix=200809 stdc=201710 CC=GCC 13.2.1 20231014 CFLAGS=\"\" LDFLAGS=\"\""}}}
{"t":{"$date":"2024-01-05T15:00:54.021+00:00"},"s":"I",  "c":"ACCESS",   "id":6788604, "ctx":"conn1538","msg":"Auth metrics report","attr":{"metric":"acquireUser","micros":0}}
{"t":{"$date":"2024-01-05T15:00:54.075+00:00"},"s":"I",  "c":"ACCESS",   "id":5286306, "ctx":"conn1538","msg":"Successfully authenticated","attr":{"client":"127.0.0.6:35453","isSpeculative":true,"isClusterMember":false,"mechanism":"SCRAM-SHA-256","user":"user","db":"db","result":0,"metrics":{"conversation_duration":{"micros":53660,"summary":{"0":{"step":1,"step_total":2,"duration_micros":116},"1":{"step":2,"step_total":2,"duration_micros":51}}}},"extraInfo":{}}}
{"t":{"$date":"2024-01-05T15:00:54.077+00:00"},"s":"I",  "c":"NETWORK",  "id":6788700, "ctx":"conn1538","msg":"Received first command on ingress connection since session start or auth handshake","attr":{"elapsedMillis":2}}
{"t":{"$date":"2024-01-05T15:00:54.088+00:00"},"s":"I",  "c":"NETWORK",  "id":22944,   "ctx":"conn1538","msg":"Connection ended","attr":{"remote":"127.0.0.6:35453","uuid":{"uuid":{"$uuid":"f2f20178-a308-4123-8f01-1c1ca85c5056"}},"connectionId":1538,"connectionCount":28}}

TLDR: the first (SECONDARY) server closed the connection with AuthenticationAbandoned: Authentication session abandoned, client has likely disconnected, while the second (PRIMARY) server closed the connection cleanly.

I am not sure whether the way how the first server closes connections is correct, but at least the connections in both MongoDB servers are closed immediately and cannot be found anymore with db.currentOp().

That's not, however, the case of the production application that for some reason leaves the connection open. The difference between the test application and the production application seems to be somewhere in Symfony or Doctrine, so I will try the SDAM debugging script to identify the difference and report the updates here.

Thank you!

from mongo-php-driver.

AlesKrajnik avatar AlesKrajnik commented on September 17, 2024

I have some more debugging insights.

TLDR is:

  • The issue happens for all connections created from a long-running PHP script, either as a long-running CLI script or a PHP-FPM configured to serve multiple requests.
  • It's the PHP script that keeps the connection open. When the script finishes (or the PHP-FPM child process respawns), the connection is immediately closed.
  • The issue does not happen with the disableClientPersistence option set to true.
  • The issue for a particular connection happens earliest after 60 seconds it happened last time or, in the case of the first occurence after 120 seconds since the connection was established, and only happens whenever a new request for that particular connection comes.
  • The issue does not happen for a Node.js client.
  • The issue does not happen in local development environment, identical to production except for not running in K8s / istio.
  • There might be probably another issue, when connecting to MongoDB replica set from a PHP CLI script, connection to PRIMARY closes cleanly, but connection to SECONDARY is closed with AuthenticationAbandoned: Authentication session abandoned, client has likely disconnected.

I still did not get to deploying the SDAM logger, because I made some new observations in my local environment. Specifically, the issue does not appear locally, despite me having the very same application's Docker image and the very same MongoDB's image, and the database also running as a replica set. Application-wise, the local environment is exactly the same as the production environment, the difference between those two is only that production application runs in a Kubernetes cluster extended by istio.

Then I also realised that I had made a mistake with my test script above - because I ran it as a CLI, meaning the PHP process finished immediately and that was the reason for the connection to be closed cleanly. When I triggered the same testing script over a HTTP request in a PHP-FPM process, the connection stayed open and the issue appeared again.

So, it seems that the PHP process keeps the connection open, whether it's a long-running process or a PHP-FPM process - because in both cases the process keeps running. Only when the logic is ran as a separate PHP process that finishes immediatelly afterwards, the issues is not observed. To verify that, I reconfigured PHP-FPM with pm.max_requests = 1 and the issue disappeared, obviously because the connections got closed after each request. Also, when I set the disableClientPersistence option to true, the issue disappeared.

Lastly, when I reported the bug, I stated that the issue appeared every "60-120 seconds". However, that's not accurate. It seems that the issue happens with a first request that comes after 60 seconds since the message appeared last time, independently on whether there was meanwhile silence on the same connection or whether there were other requests.

To illustrate that, I let the test script ran every second over HTTP as follows:

watch -n 1 -- wget http://localhost/index.php -O - >/dev/null

The AuthenticationAbandoned: Overridden by new authentication session error message for the particular connection happened every 60 seconds (plus minus a few milliseconds). The first occurence, however, happened only after 120 seconds since the connection was created.

Then I triggered the wget once, waited 90 seconds, triggered it again and the issue appeared immediatelly... then for another 60 seconds the issue did not appear and then appeared again.

Lastly, there still seems to be issue mentioned in my previous message that when ran as a one-off CLI script, the connections to the PRIMARY closes cleanly, but the connection to SECONDARY is closed with AuthenticationAbandoned: Authentication session abandoned, client has likely disconnected. Is that on purpose?

I will go and try the SDAM logger now.

from mongo-php-driver.

AlesKrajnik avatar AlesKrajnik commented on September 17, 2024

I tried the SDAM debugging. It seems that after the initial request, there is a heartbeat request every 60 seconds, and the first one does not produce the aforementioned log message, while all the following ones do. I didn't notice anything particular in the logs.

The log that appeared with the first connection:

topologyOpening: 6598519ef57b3b39ea09ab40 was opened
topologyChanged: 6598519ef57b3b39ea09ab40 changed from Unknown to Unknown
serverOpening: mongodb-internal-0.mongodb-internal.services:27017 was added to topology 6598519ef57b3b39ea09ab40
serverOpening: mongodb-internal-1.mongodb-internal.services:27017 was added to topology 6598519ef57b3b39ea09ab40
serverHeartbeatStarted: mongodb-internal-0.mongodb-internal.services:27017 heartbeat started
serverHeartbeatStarted: mongodb-internal-1.mongodb-internal.services:27017 heartbeat started
serverHeartbeatSucceeded: mongodb-internal-0.mongodb-internal.services:27017 heartbeat succeeded after 4398µs
reply: { "helloOk" : true, "topologyVersion" : { "processId" : { "$oid" : "65939774669b4a38f7e9de0a" }, "counter" : 5 }, "hosts" : [ "mongodb-internal-0.mongodb-internal.services:27017", "mongodb-internal-1.mongodb-internal.services:27017" ], "setName" : "myReplSet", "setVersion" : 6, "ismaster" : false, "secondary" : true, "primary" : "mongodb-internal-1.mongodb-internal.services:27017", "me" : "mongodb-internal-0.mongodb-internal.services:27017", "lastWrite" : { "opTime" : { "ts" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } }, "t" : 8 }, "lastWriteDate" : { "$date" : "2024-01-05T18:59:42Z" }, "majorityOpTime" : { "ts" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } }, "t" : 8 }, "majorityWriteDate" : { "$date" : "2024-01-05T18:59:42Z" } }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-05T18:59:42.854Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 1319, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-1", "SCRAM-SHA-256" ], "speculativeAuthenticate" : { "conversationId" : 1, "done" : false, "payload" : { "$binary" : { "base64" : "cj05amhmdlE4b01OcEdNVjNxdVlmVERqM09UQ2JycFJsVTFOUEhGOXNzYU1nR3ltbnhxNVF6Nk1yZmZOQ1ZpOHllLHM9VFR6cCtqejBLeTZzWnhtQnMrNWdNYmNRUERmZHFhNXdrQmtoelE9PSxpPTE1MDAw", "subType" : "00" } } }, "ok" : 1.0, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } }, "signature" : { "hash" : { "$binary" : { "base64" : "cS1Iqh7LzmAgJosLn1wLzzj269I=", "subType" : "00" } }, "keyId" : 7319174410449977349 } }, "operationTime" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } } }
serverChanged: mongodb-internal-0.mongodb-internal.services:27017 changed from Unknown to RSSecondary
previous hello response: { }
new hello response: { "helloOk" : true, "topologyVersion" : { "processId" : { "$oid" : "65939774669b4a38f7e9de0a" }, "counter" : 5 }, "hosts" : [ "mongodb-internal-0.mongodb-internal.services:27017", "mongodb-internal-1.mongodb-internal.services:27017" ], "setName" : "myReplSet", "setVersion" : 6, "ismaster" : false, "secondary" : true, "primary" : "mongodb-internal-1.mongodb-internal.services:27017", "me" : "mongodb-internal-0.mongodb-internal.services:27017", "lastWrite" : { "opTime" : { "ts" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } }, "t" : 8 }, "lastWriteDate" : { "$date" : "2024-01-05T18:59:42Z" }, "majorityOpTime" : { "ts" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } }, "t" : 8 }, "majorityWriteDate" : { "$date" : "2024-01-05T18:59:42Z" } }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-05T18:59:42.854Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 1319, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-1", "SCRAM-SHA-256" ], "ok" : 1.0, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } }, "signature" : { "hash" : { "$binary" : { "base64" : "cS1Iqh7LzmAgJosLn1wLzzj269I=", "subType" : "00" } }, "keyId" : 7319174410449977349 } }, "operationTime" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } } }
topologyChanged: 6598519ef57b3b39ea09ab40 changed from Unknown to ReplicaSetNoPrimary
serverHeartbeatSucceeded: mongodb-internal-1.mongodb-internal.services:27017 heartbeat succeeded after 9652µs
reply: { "helloOk" : true, "topologyVersion" : { "processId" : { "$oid" : "659397206a611ffcefcf29a4" }, "counter" : 12 }, "hosts" : [ "mongodb-internal-0.mongodb-internal.services:27017", "mongodb-internal-1.mongodb-internal.services:27017" ], "setName" : "myReplSet", "setVersion" : 6, "ismaster" : true, "secondary" : false, "primary" : "mongodb-internal-1.mongodb-internal.services:27017", "me" : "mongodb-internal-1.mongodb-internal.services:27017", "electionId" : { "$oid" : "7fffffff0000000000000008" }, "lastWrite" : { "opTime" : { "ts" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } }, "t" : 8 }, "lastWriteDate" : { "$date" : "2024-01-05T18:59:42Z" }, "majorityOpTime" : { "ts" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } }, "t" : 8 }, "majorityWriteDate" : { "$date" : "2024-01-05T18:59:42Z" } }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-05T18:59:42.860Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 1814, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-256", "SCRAM-SHA-1" ], "speculativeAuthenticate" : { "conversationId" : 1, "done" : false, "payload" : { "$binary" : { "base64" : "cj1lT245d211UE1oT3pxZ2k4MndERjRFTlFSckNYblJQNnh5MVAxajhGREFOZFlBWC8wSFZPWm50VjI5ZER1RzFYLHM9VFR6cCtqejBLeTZzWnhtQnMrNWdNYmNRUERmZHFhNXdrQmtoelE9PSxpPTE1MDAw", "subType" : "00" } } }, "ok" : 1.0, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } }, "signature" : { "hash" : { "$binary" : { "base64" : "cS1Iqh7LzmAgJosLn1wLzzj269I=", "subType" : "00" } }, "keyId" : 7319174410449977349 } }, "operationTime" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } } }
mongodb-internal-1.mongodb-internal.services:27017 changed from PossiblePrimary to RSPrimary
previous hello response: { }
new hello response: { "helloOk" : true, "topologyVersion" : { "processId" : { "$oid" : "659397206a611ffcefcf29a4" }, "counter" : 12 }, "hosts" : [ "mongodb-internal-0.mongodb-internal.services:27017", "mongodb-internal-1.mongodb-internal.services:27017" ], "setName" : "myReplSet", "setVersion" : 6, "ismaster" : true, "secondary" : false, "primary" : "mongodb-internal-1.mongodb-internal.services:27017", "me" : "mongodb-internal-1.mongodb-internal.services:27017", "electionId" : { "$oid" : "7fffffff0000000000000008" }, "lastWrite" : { "opTime" : { "ts" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } }, "t" : 8 }, "lastWriteDate" : { "$date" : "2024-01-05T18:59:42Z" }, "majorityOpTime" : { "ts" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } }, "t" : 8 }, "majorityWriteDate" : { "$date" : "2024-01-05T18:59:42Z" } }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-05T18:59:42.860Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 1814, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-256", "SCRAM-SHA-1" ], "ok" : 1.0, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } }, "signature" : { "hash" : { "$binary" : { "base64" : "cS1Iqh7LzmAgJosLn1wLzzj269I=", "subType" : "00" } }, "keyId" : 7319174410449977349 } }, "operationTime" : { "$timestamp" : { "t" : 1704481182, "i" : 1 } } }
topologyChanged: 6598519ef57b3b39ea09ab40 changed from ReplicaSetNoPrimary to ReplicaSetWithPrimary

I kept hitting the script (over HTTP) every second, but no log appeared until 60 seconds elapsed:

serverHeartbeatStarted: mongodb-internal-0.mongodb-internal.services:27017 heartbeat started
serverHeartbeatStarted: mongodb-internal-1.mongodb-internal.services:27017 heartbeat started
serverHeartbeatSucceeded: mongodb-internal-0.mongodb-internal.services:27017 heartbeat succeeded after 1492µs
reply: { "helloOk" : true, "topologyVersion" : { "processId" : { "$oid" : "65939774669b4a38f7e9de0a" }, "counter" : 5 }, "hosts" : [ "mongodb-internal-0.mongodb-internal.services:27017", "mongodb-internal-1.mongodb-internal.services:27017" ], "setName" : "myReplSet", "setVersion" : 6, "isWritablePrimary" : false, "secondary" : true, "primary" : "mongodb-internal-1.mongodb-internal.services:27017", "me" : "mongodb-internal-0.mongodb-internal.services:27017", "lastWrite" : { "opTime" : { "ts" : { "$timestamp" : { "t" : 1704481242, "i" : 1 } }, "t" : 8 }, "lastWriteDate" : { "$date" : "2024-01-05T19:00:42Z" }, "majorityOpTime" : { "ts" : { "$timestamp" : { "t" : 1704481242, "i" : 1 } }, "t" : 8 }, "majorityWriteDate" : { "$date" : "2024-01-05T19:00:42Z" } }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-05T19:00:43.523Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 1319, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-1", "SCRAM-SHA-256" ], "ok" : 1.0, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1704481242, "i" : 1 } }, "signature" : { "hash" : { "$binary" : { "base64" : "Q6EIpdYBimI5t0Toud4K/0KMLJs=", "subType" : "00" } }, "keyId" : 7319174410449977349 } }, "operationTime" : { "$timestamp" : { "t" : 1704481242, "i" : 1 } } }
serverHeartbeatSucceeded: mongodb-internal-1.mongodb-internal.services:27017 heartbeat succeeded after 2025µs
reply: { "helloOk" : true, "topologyVersion" : { "processId" : { "$oid" : "659397206a611ffcefcf29a4" }, "counter" : 12 }, "hosts" : [ "mongodb-internal-0.mongodb-internal.services:27017", "mongodb-internal-1.mongodb-internal.services:27017" ], "setName" : "myReplSet", "setVersion" : 6, "isWritablePrimary" : true, "secondary" : false, "primary" : "mongodb-internal-1.mongodb-internal.services:27017", "me" : "mongodb-internal-1.mongodb-internal.services:27017", "electionId" : { "$oid" : "7fffffff0000000000000008" }, "lastWrite" : { "opTime" : { "ts" : { "$timestamp" : { "t" : 1704481242, "i" : 1 } }, "t" : 8 }, "lastWriteDate" : { "$date" : "2024-01-05T19:00:42Z" }, "majorityOpTime" : { "ts" : { "$timestamp" : { "t" : 1704481242, "i" : 1 } }, "t" : 8 }, "majorityWriteDate" : { "$date" : "2024-01-05T19:00:42Z" } }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-05T19:00:43.525Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 1814, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-256", "SCRAM-SHA-1" ], "ok" : 1.0, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1704481242, "i" : 1 } }, "signature" : { "hash" : { "$binary" : { "base64" : "Q6EIpdYBimI5t0Toud4K/0KMLJs=", "subType" : "00" } }, "keyId" : 7319174410449977349 } }, "operationTime" : { "$timestamp" : { "t" : 1704481242, "i" : 1 } } }

Again, no further output until another 60 seconds elapsed:

serverHeartbeatStarted: mongodb-internal-0.mongodb-internal.services:27017 heartbeat started
serverHeartbeatStarted: mongodb-internal-1.mongodb-internal.services:27017 heartbeat started
serverHeartbeatSucceeded: mongodb-internal-0.mongodb-internal.services:27017 heartbeat succeeded after 1235µs
reply: { "helloOk" : true, "topologyVersion" : { "processId" : { "$oid" : "65939774669b4a38f7e9de0a" }, "counter" : 5 }, "hosts" : [ "mongodb-internal-0.mongodb-internal.services:27017", "mongodb-internal-1.mongodb-internal.services:27017" ], "setName" : "myReplSet", "setVersion" : 6, "isWritablePrimary" : false, "secondary" : true, "primary" : "mongodb-internal-1.mongodb-internal.services:27017", "me" : "mongodb-internal-0.mongodb-internal.services:27017", "lastWrite" : { "opTime" : { "ts" : { "$timestamp" : { "t" : 1704481302, "i" : 1 } }, "t" : 8 }, "lastWriteDate" : { "$date" : "2024-01-05T19:01:42Z" }, "majorityOpTime" : { "ts" : { "$timestamp" : { "t" : 1704481302, "i" : 1 } }, "t" : 8 }, "majorityWriteDate" : { "$date" : "2024-01-05T19:01:42Z" } }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-05T19:01:44.017Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 1319, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-1", "SCRAM-SHA-256" ], "ok" : 1.0, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1704481302, "i" : 1 } }, "signature" : { "hash" : { "$binary" : { "base64" : "tt2pQCAJxqcLz/AVUNzuDVHPUrE=", "subType" : "00" } }, "keyId" : 7319174410449977349 } }, "operationTime" : { "$timestamp" : { "t" : 1704481302, "i" : 1 } } }
serverHeartbeatSucceeded: mongodb-internal-1.mongodb-internal.services:27017 heartbeat succeeded after 2392µs
reply: { "helloOk" : true, "topologyVersion" : { "processId" : { "$oid" : "659397206a611ffcefcf29a4" }, "counter" : 12 }, "hosts" : [ "mongodb-internal-0.mongodb-internal.services:27017", "mongodb-internal-1.mongodb-internal.services:27017" ], "setName" : "myReplSet", "setVersion" : 6, "isWritablePrimary" : true, "secondary" : false, "primary" : "mongodb-internal-1.mongodb-internal.services:27017", "me" : "mongodb-internal-1.mongodb-internal.services:27017", "electionId" : { "$oid" : "7fffffff0000000000000008" }, "lastWrite" : { "opTime" : { "ts" : { "$timestamp" : { "t" : 1704481302, "i" : 1 } }, "t" : 8 }, "lastWriteDate" : { "$date" : "2024-01-05T19:01:42Z" }, "majorityOpTime" : { "ts" : { "$timestamp" : { "t" : 1704481302, "i" : 1 } }, "t" : 8 }, "majorityWriteDate" : { "$date" : "2024-01-05T19:01:42Z" } }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-05T19:01:44.019Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 1814, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-256", "SCRAM-SHA-1" ], "ok" : 1.0, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1704481302, "i" : 1 } }, "signature" : { "hash" : { "$binary" : { "base64" : "tt2pQCAJxqcLz/AVUNzuDVHPUrE=", "subType" : "00" } }, "keyId" : 7319174410449977349 } }, "operationTime" : { "$timestamp" : { "t" : 1704481302, "i" : 1 } } }

At this point, the following error appeared in MongoDB's logs:

{
  "t": {
    "$date": "2024-01-05T19:01:44.019+00:00"
  },
  "s": "I",
  "c": "ACCESS",
  "id": 5286307,
  "ctx": "conn1814",
  "msg": "Failed to authenticate",
  "attr": {
    "client": "127.0.0.6:39925",
    "isSpeculative": false,
    "isClusterMember": false,
    "mechanism": "",
    "user": "",
    "db": "",
    "error": "AuthenticationAbandoned: Overridden by new authentication session",
    "result": 337,
    "metrics": {
      "conversation_duration": {
        "micros": 60494130,
        "summary": {}
      }
    },
    "extraInfo": {}
  }
}

from mongo-php-driver.

AlesKrajnik avatar AlesKrajnik commented on September 17, 2024

At this point, I am puzzled as where to look next.

  • Apparently, it's happening in the production K8s cluster extended with istio and not in local development environment.
  • At the same point, it does not happen in the very same cluster with the same MongoDB replica set when connecting from Node.js.
  • It seems to be matching exactly the time of the heartbeat request from the PHP client.
    • At the time of the first heartbeat request, no error is produced.
    • At the time of all the following requests, the error is produced.
    • Not sure, if the error is the consequence of the heartbeat request, or just a correlation.

I'm not familiar with istio, but as a proxy is it possible that the driver-to-proxy connection is terminating while the proxy-to-server connection is not? In that case, we might see the driver reestablishing a connection and sending the initial auth handshake (as it should); however, the proxy only ends up passing the auth handshake over to the server, where it's unexpected due to there already being an existing auth session.

@jmikola It doesn't seem to be this case - since the connection is established from driver-through-proxy-to-server. When I shut down the PHP process, the connection is closed. And even after the errors appear, they are all related to a single MongoDB's connection, I didn't observe new connections created as long as the process lives.

Also worth mentioning, I have it configured as TCP proxying, I don't have the experimental MongoDB support enabled in istio. That means, istio should see only TCP traffic and should not try to manipulate it in any way.

I'd also like to confirm that there's nothing in istio that would result in multiple driver connections being multiplexed across a single server connection. In other words, can we safely assume that "conn704" in the logs above corresponds to exactly one instance of the PHP driver? If the logs are split across several minutes, I think the likely suspect would be whatever process is watching a change stream (e.g. MongoDB\Collection::watch()).

AFAIK Envoy multiplexes HTTP traffic and not TCP traffic. But I don't fully understand it, so I need to learn more about it. Even though, in that case, I would expect to see troubles also for the Node.js client.

So I have a few more ideas to explore:

  • To figure out whether the heartbeat is the root cause or correlation.
  • To understand why this is not happening in the local development environment. Maybe I have missed it?
  • To understand connection pooling and multiplexing in istio / Envoy and ask for help in the istio channel.
  • To understand what is different about the way how the Node.js application connects. At the first glance, there are two connections for one "watch" process and I want to explore it more thoroughly.

If anyone has any hints on where to look next, I'd be greatful for them.

Thank you!

from mongo-php-driver.

AlesKrajnik avatar AlesKrajnik commented on September 17, 2024

I decided to deploy a new MongoDB testing server into the same K8s cluster but not as a replica set, to better see what is happening.

I enabled debugging as follows:

test> db.setLogLevel(5, 'accessControl');
test> db.setLogLevel(5, 'network');

I used the same script as above and triggered the script every second over a HTTP request.

Here are the logs from the MongoDB server:

{"t":{"$date":"2024-01-06T10:52:35.993+00:00"},"s":"D4", "c":"NETWORK",  "id":4898000, "ctx":"listener","msg":"Setting initial ServiceExecutor context for client","attr":{"client":"conn15","useDedicatedThread":true,"canUseReserved":false}}
{"t":{"$date":"2024-01-06T10:52:35.993+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"127.0.0.6:49307","uuid":{"uuid":{"$uuid":"dd532a12-7a31-474c-aeae-4fa1c5794ed0"}},"connectionId":15,"connectionCount":6}}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn15","msg":"client metadata","attr":{"remote":"127.0.0.6:49307","client":"conn15","doc":{"application":{"name":"test"},"driver":{"name":"mongoc / ext-mongodb:PHP / PHPLIB ","version":"1.25.2 / 1.17.2 / 1.17.0 "},"os":{"type":"Linux","name":"Alpine Linux","version":"3.19.0","architecture":"x86_64"},"platform":"PHP 8.3.0 cfg=0x03d21ea8f9 posix=200809 stdc=201710 CC=GCC 13.2.1 20231014 CFLAGS=\"\" LDFLAGS=\"\""}}}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"D3", "c":"NETWORK",  "id":22934,   "ctx":"conn15","msg":"Starting server-side compression negotiation"}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"D3", "c":"NETWORK",  "id":22936,   "ctx":"conn15","msg":"No compressors provided"}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"D3", "c":"ACCESS",   "id":5286300, "ctx":"conn15","msg":"Starting authentication step","attr":{"step":"SaslSupportedMechanisms"}}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"D3", "c":"ACCESS",   "id":5859101, "ctx":"conn15","msg":"Set user name for session","attr":{"userName":{"user":"root","db":"admin"},"oldName":{"user":"","db":""}}}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"D1", "c":"ACCESS",   "id":20226,   "ctx":"conn15","msg":"Returning user from cache","attr":{"user":{"user":"root","db":"admin"}}}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"D3", "c":"ACCESS",   "id":5286301, "ctx":"conn15","msg":"Finished authentication step","attr":{"step":"SaslSupportedMechanisms"}}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"D3", "c":"ACCESS",   "id":5286300, "ctx":"conn15","msg":"Starting authentication step","attr":{"step":"SpeculativeSaslStart"}}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"D3", "c":"ACCESS",   "id":5286203, "ctx":"conn15","msg":"Updating user name for session","attr":{"userName":{"user":"","db":"admin"},"oldName":{"user":"","db":""}}}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"D3", "c":"ACCESS",   "id":5286200, "ctx":"conn15","msg":"Setting mechanism name","attr":{"mechanism":"SCRAM-SHA-256"}}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"D3", "c":"ACCESS",   "id":5286304, "ctx":"conn15","msg":"Determined mechanism for authentication"}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"I",  "c":"ACCESS",   "id":6788604, "ctx":"conn15","msg":"Auth metrics report","attr":{"metric":"acquireUser","micros":0}}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"D1", "c":"ACCESS",   "id":20226,   "ctx":"conn15","msg":"Returning user from cache","attr":{"user":{"user":"root","db":"admin"}}}
{"t":{"$date":"2024-01-06T10:52:35.995+00:00"},"s":"D3", "c":"ACCESS",   "id":5286301, "ctx":"conn15","msg":"Finished authentication step","attr":{"step":"SpeculativeSaslStart"}}
{"t":{"$date":"2024-01-06T10:52:36.059+00:00"},"s":"D3", "c":"ACCESS",   "id":5286300, "ctx":"conn15","msg":"Starting authentication step","attr":{"step":"SaslContinue"}}
{"t":{"$date":"2024-01-06T10:52:36.059+00:00"},"s":"D1", "c":"ACCESS",   "id":20226,   "ctx":"conn15","msg":"Returning user from cache","attr":{"user":{"user":"root","db":"admin"}}}
{"t":{"$date":"2024-01-06T10:52:36.060+00:00"},"s":"D3", "c":"ACCESS",   "id":5286203, "ctx":"conn15","msg":"Updating user name for session","attr":{"userName":{"user":"root","db":"admin"},"oldName":{"user":"","db":"admin"}}}
{"t":{"$date":"2024-01-06T10:52:36.060+00:00"},"s":"I",  "c":"ACCESS",   "id":5286306, "ctx":"conn15","msg":"Successfully authenticated","attr":{"client":"127.0.0.6:49307","isSpeculative":true,"isClusterMember":false,"mechanism":"SCRAM-SHA-256","user":"root","db":"admin","result":0,"metrics":{"conversation_duration":{"micros":64379,"summary":{"0":{"step":1,"step_total":2,"duration_micros":50},"1":{"step":2,"step_total":2,"duration_micros":32}}}},"extraInfo":{}}}
{"t":{"$date":"2024-01-06T10:52:36.060+00:00"},"s":"D3", "c":"ACCESS",   "id":5286301, "ctx":"conn15","msg":"Finished authentication step","attr":{"step":"SaslContinue"}}
{"t":{"$date":"2024-01-06T10:52:36.061+00:00"},"s":"I",  "c":"NETWORK",  "id":6788700, "ctx":"conn15","msg":"Received first command on ingress connection since session start or auth handshake","attr":{"elapsedMillis":2}}

{"t":{"$date":"2024-01-06T10:53:36.156+00:00"},"s":"D3", "c":"NETWORK",  "id":22934,   "ctx":"conn15","msg":"Starting server-side compression negotiation"}
{"t":{"$date":"2024-01-06T10:53:36.156+00:00"},"s":"D3", "c":"NETWORK",  "id":22935,   "ctx":"conn15","msg":"Compression negotiation not requested by client"}
{"t":{"$date":"2024-01-06T10:53:36.156+00:00"},"s":"D3", "c":"ACCESS",   "id":5286300, "ctx":"conn15","msg":"Starting authentication step","attr":{"step":"SaslSupportedMechanisms"}}
{"t":{"$date":"2024-01-06T10:53:36.156+00:00"},"s":"D3", "c":"ACCESS",   "id":5859101, "ctx":"conn15","msg":"Set user name for session","attr":{"userName":{"user":"root","db":"admin"},"oldName":{"user":"","db":""}}}
{"t":{"$date":"2024-01-06T10:53:36.156+00:00"},"s":"D1", "c":"ACCESS",   "id":20226,   "ctx":"conn15","msg":"Returning user from cache","attr":{"user":{"user":"root","db":"admin"}}}
{"t":{"$date":"2024-01-06T10:53:36.156+00:00"},"s":"D3", "c":"ACCESS",   "id":5286301, "ctx":"conn15","msg":"Finished authentication step","attr":{"step":"SaslSupportedMechanisms"}}

{"t":{"$date":"2024-01-06T10:54:37.159+00:00"},"s":"D3", "c":"NETWORK",  "id":22934,   "ctx":"conn15","msg":"Starting server-side compression negotiation"}
{"t":{"$date":"2024-01-06T10:54:37.159+00:00"},"s":"D3", "c":"NETWORK",  "id":22935,   "ctx":"conn15","msg":"Compression negotiation not requested by client"}
{"t":{"$date":"2024-01-06T10:54:37.159+00:00"},"s":"D3", "c":"ACCESS",   "id":5286300, "ctx":"conn15","msg":"Starting authentication step","attr":{"step":"SaslSupportedMechanisms"}}
{"t":{"$date":"2024-01-06T10:54:37.160+00:00"},"s":"I",  "c":"ACCESS",   "id":5286307, "ctx":"conn15","msg":"Failed to authenticate","attr":{"client":"127.0.0.6:49307","isSpeculative":false,"isClusterMember":false,"mechanism":"","user":"","db":"","error":"AuthenticationAbandoned: Overridden by new authentication session","result":337,"metrics":{"conversation_duration":{"micros":61003643,"summary":{}}},"extraInfo":{}}}
{"t":{"$date":"2024-01-06T10:54:37.160+00:00"},"s":"D3", "c":"ACCESS",   "id":5859101, "ctx":"conn15","msg":"Set user name for session","attr":{"userName":{"user":"root","db":"admin"},"oldName":{"user":"","db":""}}}
{"t":{"$date":"2024-01-06T10:54:37.160+00:00"},"s":"D1", "c":"ACCESS",   "id":20226,   "ctx":"conn15","msg":"Returning user from cache","attr":{"user":{"user":"root","db":"admin"}}}
{"t":{"$date":"2024-01-06T10:54:37.160+00:00"},"s":"D3", "c":"ACCESS",   "id":5286301, "ctx":"conn15","msg":"Finished authentication step","attr":{"step":"SaslSupportedMechanisms"}}

{"t":{"$date":"2024-01-06T10:55:37.645+00:00"},"s":"D3", "c":"NETWORK",  "id":22934,   "ctx":"conn15","msg":"Starting server-side compression negotiation"}
{"t":{"$date":"2024-01-06T10:55:37.645+00:00"},"s":"D3", "c":"NETWORK",  "id":22935,   "ctx":"conn15","msg":"Compression negotiation not requested by client"}
{"t":{"$date":"2024-01-06T10:55:37.645+00:00"},"s":"D3", "c":"ACCESS",   "id":5286300, "ctx":"conn15","msg":"Starting authentication step","attr":{"step":"SaslSupportedMechanisms"}}
{"t":{"$date":"2024-01-06T10:55:37.645+00:00"},"s":"I",  "c":"ACCESS",   "id":5286307, "ctx":"conn15","msg":"Failed to authenticate","attr":{"client":"127.0.0.6:49307","isSpeculative":false,"isClusterMember":false,"mechanism":"","user":"","db":"","error":"AuthenticationAbandoned: Overridden by new authentication session","result":337,"metrics":{"conversation_duration":{"micros":60485279,"summary":{}}},"extraInfo":{}}}
{"t":{"$date":"2024-01-06T10:55:37.645+00:00"},"s":"D3", "c":"ACCESS",   "id":5859101, "ctx":"conn15","msg":"Set user name for session","attr":{"userName":{"user":"root","db":"admin"},"oldName":{"user":"","db":""}}}
{"t":{"$date":"2024-01-06T10:55:37.645+00:00"},"s":"D1", "c":"ACCESS",   "id":20226,   "ctx":"conn15","msg":"Returning user from cache","attr":{"user":{"user":"root","db":"admin"}}}
{"t":{"$date":"2024-01-06T10:55:37.645+00:00"},"s":"D3", "c":"ACCESS",   "id":5286301, "ctx":"conn15","msg":"Finished authentication step","attr":{"step":"SaslSupportedMechanisms"}}

{"t":{"$date":"2024-01-06T10:56:38.015+00:00"},"s":"D3", "c":"NETWORK",  "id":22934,   "ctx":"conn15","msg":"Starting server-side compression negotiation"}
{"t":{"$date":"2024-01-06T10:56:38.015+00:00"},"s":"D3", "c":"NETWORK",  "id":22935,   "ctx":"conn15","msg":"Compression negotiation not requested by client"}
{"t":{"$date":"2024-01-06T10:56:38.015+00:00"},"s":"D3", "c":"ACCESS",   "id":5286300, "ctx":"conn15","msg":"Starting authentication step","attr":{"step":"SaslSupportedMechanisms"}}
{"t":{"$date":"2024-01-06T10:56:38.015+00:00"},"s":"I",  "c":"ACCESS",   "id":5286307, "ctx":"conn15","msg":"Failed to authenticate","attr":{"client":"127.0.0.6:49307","isSpeculative":false,"isClusterMember":false,"mechanism":"","user":"","db":"","error":"AuthenticationAbandoned: Overridden by new authentication session","result":337,"metrics":{"conversation_duration":{"micros":60370564,"summary":{}}},"extraInfo":{}}}
{"t":{"$date":"2024-01-06T10:56:38.015+00:00"},"s":"D3", "c":"ACCESS",   "id":5859101, "ctx":"conn15","msg":"Set user name for session","attr":{"userName":{"user":"root","db":"admin"},"oldName":{"user":"","db":""}}}
{"t":{"$date":"2024-01-06T10:56:38.015+00:00"},"s":"D1", "c":"ACCESS",   "id":20226,   "ctx":"conn15","msg":"Returning user from cache","attr":{"user":{"user":"root","db":"admin"}}}
{"t":{"$date":"2024-01-06T10:56:38.015+00:00"},"s":"D3", "c":"ACCESS",   "id":5286301, "ctx":"conn15","msg":"Finished authentication step","attr":{"step":"SaslSupportedMechanisms"}}

Here's the timeline overview:

  • 10:52:35: the first request
  • 10:53:36: the first heartbeat
  • 10:54:37: the second heartbeat, with the "Failed to authenticate" message

All the heartbeat requests have the "oldName":{"user":"","db":""} in their log... can it be related?

Here are the logs from the SDAM logger:

[2024-01-06T10:52:35+00:00] topologyOpening: 659930f37b2ea2e7130f6580 was opened
[2024-01-06T10:52:35+00:00] topologyChanged: 659930f37b2ea2e7130f6580 changed from Unknown to Single
[2024-01-06T10:52:35+00:00] serverOpening: mongodb-test-0.mongodb-test.services:27017 was added to topology 659930f37b2ea2e7130f6580
[2024-01-06T10:52:35+00:00] serverHeartbeatStarted: mongodb-test-0.mongodb-test.services:27017 heartbeat started
[2024-01-06T10:52:35+00:00] serverHeartbeatSucceeded: mongodb-test-0.mongodb-test.services:27017 heartbeat succeeded after 5708µs
[2024-01-06T10:52:35+00:00] reply: { "helloOk" : true, "ismaster" : true, "topologyVersion" : { "processId" : { "$oid" : "659926177a67a614979bb84c" }, "counter" : 0 }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-06T10:52:35.995Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 15, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-1", "SCRAM-SHA-256" ], "speculativeAuthenticate" : { "conversationId" : 1, "done" : false, "payload" : { "$binary" : { "base64" : "cj10aEdCcDBxT1FqK25NUEFwSmR0Z01xdVYxV3hEaVFtV1V1Zys5VDhDZkFtYmJibEcvN2ovbWt2VTF6SzJ3NGk0LHM9QTBuVTNmT2F5Y01IbWhHUU50dnRQTXU2SkdWeW1JS2Z2L2w4cHc9PSxpPTE1MDAw", "subType" : "00" } } }, "ok" : 1.0 }
[2024-01-06T10:52:35+00:00] serverChanged: mongodb-test-0.mongodb-test.services:27017 changed from Unknown to Standalone
[2024-01-06T10:52:35+00:00] previous hello response: { }
[2024-01-06T10:52:35+00:00] new hello response: { "helloOk" : true, "ismaster" : true, "topologyVersion" : { "processId" : { "$oid" : "659926177a67a614979bb84c" }, "counter" : 0 }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-06T10:52:35.995Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 15, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-1", "SCRAM-SHA-256" ], "ok" : 1.0 }
[2024-01-06T10:52:35+00:00] topologyChanged: 659930f37b2ea2e7130f6580 changed from Single to Single

[2024-01-06T10:53:36+00:00] serverHeartbeatStarted: mongodb-test-0.mongodb-test.services:27017 heartbeat started
[2024-01-06T10:53:36+00:00] serverHeartbeatSucceeded: mongodb-test-0.mongodb-test.services:27017 heartbeat succeeded after 1288µs
[2024-01-06T10:53:36+00:00] reply: { "helloOk" : true, "isWritablePrimary" : true, "topologyVersion" : { "processId" : { "$oid" : "659926177a67a614979bb84c" }, "counter" : 0 }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-06T10:53:36.156Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 15, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-1", "SCRAM-SHA-256" ], "ok" : 1.0 }

[2024-01-06T10:54:37+00:00] serverHeartbeatStarted: mongodb-test-0.mongodb-test.services:27017 heartbeat started
[2024-01-06T10:54:37+00:00] serverHeartbeatSucceeded: mongodb-test-0.mongodb-test.services:27017 heartbeat succeeded after 2921µs
[2024-01-06T10:54:37+00:00] reply: { "helloOk" : true, "isWritablePrimary" : true, "topologyVersion" : { "processId" : { "$oid" : "659926177a67a614979bb84c" }, "counter" : 0 }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-06T10:54:37.159Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 15, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-1", "SCRAM-SHA-256" ], "ok" : 1.0 }

[2024-01-06T10:55:37+00:00] serverHeartbeatStarted: mongodb-test-0.mongodb-test.services:27017 heartbeat started
[2024-01-06T10:55:37+00:00] serverHeartbeatSucceeded: mongodb-test-0.mongodb-test.services:27017 heartbeat succeeded after 2081µs
[2024-01-06T10:55:37+00:00] reply: { "helloOk" : true, "isWritablePrimary" : true, "topologyVersion" : { "processId" : { "$oid" : "659926177a67a614979bb84c" }, "counter" : 0 }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-06T10:55:37.645Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 15, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-1", "SCRAM-SHA-256" ], "ok" : 1.0 }

[2024-01-06T10:56:38+00:00] serverHeartbeatStarted: mongodb-test-0.mongodb-test.services:27017 heartbeat started
[2024-01-06T10:56:38+00:00] serverHeartbeatSucceeded: mongodb-test-0.mongodb-test.services:27017 heartbeat succeeded after 853µs
[2024-01-06T10:56:38+00:00] reply: { "helloOk" : true, "isWritablePrimary" : true, "topologyVersion" : { "processId" : { "$oid" : "659926177a67a614979bb84c" }, "counter" : 0 }, "maxBsonObjectSize" : 16777216, "maxMessageSizeBytes" : 48000000, "maxWriteBatchSize" : 100000, "localTime" : { "$date" : "2024-01-06T10:56:38.015Z" }, "logicalSessionTimeoutMinutes" : 30, "connectionId" : 15, "minWireVersion" : 0, "maxWireVersion" : 21, "readOnly" : false, "saslSupportedMechs" : [ "SCRAM-SHA-1", "SCRAM-SHA-256" ], "ok" : 1.0 }

I don't see anything suspicious there.

from mongo-php-driver.

AlesKrajnik avatar AlesKrajnik commented on September 17, 2024

I have redeployed my testing pods without istio and the issue still persists. This IMO rules istio out.

I will still try to replicate the bug in the local development environment.

from mongo-php-driver.

AlesKrajnik avatar AlesKrajnik commented on September 17, 2024

I was eventually able to replicate the issue locally.

I created a small project to demonstrate the behaviour: https://github.com/aleshaczech/mongo-php-driver-issue-1506. It is based on PHP-FPM v8.3.1 and MongoDB v7.0.4.

To observe the issue, one should do:

docker compose down && \
docker compose rm -f && \
docker compose build --pull && \
docker compose up -d && \
docker compose logs -f

...and wait 120 seconds.

Here's the (redacted) output:

...
mongo-php-driver-issue-1506-test-1     | wait-for-it.sh: mongodb:27017 is available after 6 seconds
mongo-php-driver-issue-1506-mongodb-1  | {"t":{"$date":"2024-01-07T15:43:26.472+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"172.30.0.2:52624","uuid":{"uuid":{"$uuid":"7385e68c-1ea9-44a3-98a5-bdcc6d76854b"}},"connectionId":2,"connectionCount":1}}
mongo-php-driver-issue-1506-mongodb-1  | {"t":{"$date":"2024-01-07T15:43:26.473+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn2","msg":"client metadata","attr":{"remote":"172.30.0.2:52624","client":"conn2","doc":{"application":{"name":"test"},"driver":{"name":"mongoc / ext-mongodb:PHP / PHPLIB ","version":"1.25.2 / 1.17.2 / 1.17.0 "},"os":{"type":"Linux","name":"Alpine Linux","version":"3.19.0","architecture":"x86_64"},"platform":"PHP 8.3.1 cfg=0x03d21ea8f9 posix=200809 stdc=201710 CC=GCC 13.2.1 20231014 CFLAGS=\"\" LDFLAGS=\"\""}}}
mongo-php-driver-issue-1506-mongodb-1  | {"t":{"$date":"2024-01-07T15:43:26.476+00:00"},"s":"I",  "c":"ACCESS",   "id":6788604, "ctx":"conn2","msg":"Auth metrics report","attr":{"metric":"acquireUser","micros":9}}
mongo-php-driver-issue-1506-mongodb-1  | {"t":{"$date":"2024-01-07T15:43:26.508+00:00"},"s":"I",  "c":"ACCESS",   "id":5286306, "ctx":"conn2","msg":"Successfully authenticated","attr":{"client":"172.30.0.2:52624","isSpeculative":true,"isClusterMember":false,"mechanism":"SCRAM-SHA-256","user":"root","db":"admin","result":0,"metrics":{"conversation_duration":{"micros":31915,"summary":{"0":{"step":1,"step_total":2,"duration_micros":146},"1":{"step":2,"step_total":2,"duration_micros":102}}}},"extraInfo":{}}}
mongo-php-driver-issue-1506-mongodb-1  | {"t":{"$date":"2024-01-07T15:43:26.509+00:00"},"s":"I",  "c":"NETWORK",  "id":6788700, "ctx":"conn2","msg":"Received first command on ingress connection since session start or auth handshake","attr":{"elapsedMillis":1}}
mongo-php-driver-issue-1506-test-1     | [2024-01-07T15:43:26+00:00] OK
mongo-php-driver-issue-1506-test-1     | [2024-01-07T15:43:27+00:00] OK
...
mongo-php-driver-issue-1506-test-1     | [2024-01-07T15:45:24+00:00] OK
mongo-php-driver-issue-1506-test-1     | [2024-01-07T15:45:25+00:00] OK
mongo-php-driver-issue-1506-mongodb-1  | {"t":{"$date":"2024-01-07T15:45:26.882+00:00"},"s":"I",  "c":"ACCESS",   "id":5286307, "ctx":"conn2","msg":"Failed to authenticate","attr":{"client":"172.30.0.2:52624","isSpeculative":false,"isClusterMember":false,"mechanism":"","user":"","db":"","error":"AuthenticationAbandoned: Overridden by new authentication session","result":337,"metrics":{"conversation_duration":{"micros":60173698,"summary":{}}},"extraInfo":{}}}
mongo-php-driver-issue-1506-test-1     | [2024-01-07T15:45:26+00:00] OK
mongo-php-driver-issue-1506-test-1     | [2024-01-07T15:45:27+00:00] OK
...

from mongo-php-driver.

jmikola avatar jmikola commented on September 17, 2024

@aleshaczech: Thanks for the excellent debugging. I'll investigate this further and follow up once I have something to report. The 60-second delay (I can't speak for 120) corresponds to the single-threaded monitoring interval used in libmongoc, so I'm going to start by looking into whether the outgoing monitoring command is being improperly assembled (e.g. re-using a document previously prepared for the initial handshake).

from mongo-php-driver.

emilushi avatar emilushi commented on September 17, 2024

I am having the same issue on a nodejs application with mongoose. I have a similar setup as @aleshaczech, I have some processes running on background and as well a frontend that makes requests.

Looking at the logs I do see the same notification pop in almost every minute:

{"t":{"$date":"2024-02-28T10:16:54.548+00:00"},"s":"I",  "c":"ACCESS",   "id":5286307, "ctx":"conn5","msg":"Failed to authenticate","attr":{"client":"192.168.65.1:29381","isSpeculative":false,"isClusterMember":false,"mechanism":"","user":"","db":"","error":"AuthenticationAbandoned: Overridden by new authentication session","result":337,"metrics":{"conversation_duration":{"micros":60010044,"summary":{}}},"extraInfo":{}}}

Adding authMechanism=SCRAM-SHA-256 to the connection string didn't made the notification go away.
The background processes are running without any interruption and as well the client is able to make requests and connect to mongo successfully. This looks like a session overlap based on the message.

from mongo-php-driver.

alexbevi avatar alexbevi commented on September 17, 2024

@emilushi this sounds like something we'd want to investigate. Are you able to share a self-contained reproduction that produces this result? If so you could open a new ticket in JIRA under the Node project so that we can dig into this further.

from mongo-php-driver.

emilushi avatar emilushi commented on September 17, 2024

@alexbevi I will try to create one for you tomorrow.

from mongo-php-driver.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.