Describe the bug
We are trying to send push notifications through APNS from a Queues job, but we keep running into the following error:
Connection request timed out. This might indicate a connection deadlock in your application. If you're running long running requests, consider increasing your connection timeout. (AsyncKit/ConnectionPool/EventLoopConnectionPool.swift:213)
After this happens, every other attempt to send a push notification returns the same error. It is almost as if a pipe is clogged up by a request that never completes and never releases its connection.
Example
In my example I am dequeuing two jobs, Job A and Job B. Job A sends a push notification to a token, and Job B sends a push notification as well.
Job A runs first:
2021-01-25T14:11:17.210+01:00 [ codes.vapor.application ] [ DEBUG ] Send - starting up (APNSwift/APNSwiftConnection.swift:175)
2021-01-25T14:11:17.210+01:00 [ codes.vapor.application ] [ INFO ] Send - sending (APNSwift/APNSwiftConnection.swift:206)
2021-01-25T14:11:17.211+01:00 [ codes.vapor.application ] [ DEBUG ] Request - building (APNSwift/APNSwiftRequestEncoder.swift:70)
2021-01-25T14:11:17.211+01:00 [ codes.vapor.application ] [ TRACE ] Request - built (APNSwift/APNSwiftRequestEncoder.swift:103)
2021-01-25T14:11:17.211+01:00 [ codes.vapor.application ] [ TRACE ] Request - sent (APNSwift/APNSwiftRequestEncoder.swift:107)
As we can see, the request is sent.
Job B runs right after. The environment has 1 event loop, and therefore 1 connection in the pool used by APNS. Because that connection is still held by Job A, Job B's request is put on the waitlist until Job A finishes.
2021-01-25T14:11:17.870+01:00 [ codes.vapor.application ] [ DEBUG ] Connection pool exhausted on this event loop, adding request to waitlist (AsyncKit/ConnectionPool/EventLoopConnectionPool.swift:207)
Exactly 10 seconds later (the standard connectionPoolTimeout) this error occurs:
2021-01-25T14:11:27.871+01:00 [ codes.vapor.application ] [ ERROR ] Connection request timed out. This might indicate a connection deadlock in your application. If you're running long running requests, consider increasing your connection timeout. (AsyncKit/ConnectionPool/EventLoopConnectionPool.swift:213)
All subsequent attempts to send a notification return the same error. It is as if Job A is still holding the connection, so no other job can use it.
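To make the suspected failure mode concrete, here is a minimal sketch of what I think is happening. It is a toy model and not AsyncKit's actual implementation: `PoolModel`, `Outcome`, and `simulateStuckConnection` are hypothetical names made up for this illustration. The only assumptions are the ones from the logs above: one connection in the pool, a 10 second request timeout, and a first job that never gives its connection back.

```swift
// Hypothetical model, NOT AsyncKit's implementation. It only captures
// "take a free connection, or wait for the timeout and fail".
struct PoolModel {
    enum Outcome: Equatable {
        case acquired
        case timedOut(afterSeconds: Double)
    }

    var freeConnections: Int
    let requestTimeout: Double // seconds, like the 10 s default seen in the logs

    // Hands out a connection if one is free. Otherwise the request would sit
    // on the waitlist and, since nothing releases a connection in the
    // meantime, fail once the timeout has elapsed.
    mutating func request() -> Outcome {
        if freeConnections > 0 {
            freeConnections -= 1
            return .acquired
        }
        return .timedOut(afterSeconds: requestTimeout)
    }

    // What should happen when a job finishes, successfully or not.
    mutating func release() {
        freeConnections += 1
    }
}

// Replays the scenario from this report: Job A takes the only connection
// and never releases it, so Job B and every later job time out.
func simulateStuckConnection() -> [PoolModel.Outcome] {
    var pool = PoolModel(freeConnections: 1, requestTimeout: 10)
    let jobA = pool.request()
    let jobB = pool.request()
    let laterJob = pool.request()
    return [jobA, jobB, laterJob]
}
```

In this model a single call to `release()` after Job A would let the next request succeed again, which matches the observation that only a restart (which recreates the pool) unblocks things.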
Potential fixes
I can't think of any solution other than increasing the connection pool timeout, but 10 seconds is already a long time. I don't understand why no response comes back within that period, even if it is a failure. I would also expect the library to release the connection once the request has timed out.
Note: This error is not always reproducible, but it happens quite often (daily), and we have to restart our servers to unblock the connection, so it is a big issue for us.