Comments (10)
@mcsimps2 Thanks for the quick feedback, it seems that the fix could be pretty straightforward. It might even make sense to rename `timeout` to `total_timeout` in order to better distinguish it from the `timeout` passed to the `requests` transport...
The issue can probably also be reproduced by artificially throttling one's internet connection, but if there is a need to access a big file, I will let you know. Verifying the patch as part of a PR review, once it is ready, would also be cool!
from python-storage.
I also realize that it would be wise to split such large files into a multi-part upload, but the issue still remains, especially on slow internet connections.
@plamut, PTAL
(disclaimer: speaking off the top of my head, I would need to check the code to confirm)
Not a definite answer, but the motivation for adding a `TimeoutGuard` was to support customizable total timeouts for the underlying operations, when these operations can potentially involve multiple requests.
For example, sometimes there is a Future object sitting on top that wants to timeout after X seconds, even though each individual underlying request would take less than X seconds. This was needed to support some use case scenarios in BigQuery.
With that said, it is possible that this addition broke the expected semantics in other use cases, e.g. where `timeout` is expected to represent a per-request timeout.
If there is no explicit timeout given (`timeout=None`), then the `TimeoutGuard` should act as a no-op. I think the following paragraph is the key:
File uploads for a couple minutes, exceeding the default timeout of 60/61 seconds (...) that the TimeGuard uses.
IIRC, this is what happens: the library user provides no explicit timeout, but unlike some other libraries, this library also defines a default timeout of `(60, 61)`, which is passed to the `TimeoutGuard`. However, that timeout is probably meant as a timeout for each `requests` request, and not the total timeout. As the `requests` docs explain:
timeout is not a time limit on the entire response download; rather, an exception is raised if the server has not issued a response for timeout seconds (more precisely, if no bytes have been received on the underlying socket for timeout seconds). If no timeout is specified explicitly, requests do not time out.
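To make the distinction concrete, here is a minimal sketch of a wall-clock guard. This is an illustration of the concept only, not the actual google-auth `TimeoutGuard` implementation; the class name is made up. It shows the key property discussed above: a budget of `None` makes the guard a no-op, while a finite budget bounds the *total* elapsed time of a block rather than any single request.

```python
import time

class WallClockGuard:
    """Illustrative sketch only -- not the actual google-auth TimeoutGuard.

    Enforces a *total* wall-clock budget over a whole block of code,
    as opposed to requests' per-request connect/read timeouts.
    """

    def __init__(self, max_allowed_time=None):
        self.max_allowed_time = max_allowed_time

    def __enter__(self):
        self._start = time.monotonic()
        return self

    def __exit__(self, exc_type, exc, tb):
        # max_allowed_time=None makes the guard a no-op, which is the
        # default behavior argued for above.
        if self.max_allowed_time is not None:
            if time.monotonic() - self._start > self.max_allowed_time:
                raise TimeoutError("total wall-clock budget exceeded")
        return False

# No budget: even a slow block passes.
with WallClockGuard(max_allowed_time=None):
    time.sleep(0.05)

# With a budget, the slow block trips the guard.
try:
    with WallClockGuard(max_allowed_time=0.01):
        time.sleep(0.05)
except TimeoutError as exc:
    print("guard fired:", exc)
```

Note that such a guard fires even when each individual request would have stayed under the `requests` socket timeout, which is exactly the failure mode reported for large uploads.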
@mcsimps2 Does this sound plausible? From the issue description it seems to me that your application does not use explicit timeouts, is that correct?
(in which case the reported timeout error is indeed a bug - thanks for reporting it!)
Update: Classified as P1, as this can break large file uploads, yet no obvious and reasonable workaround exists (monkeypatching does not count).
Ah, that does make sense - I see how that's useful if you have multiple requests going on and need to cancel an operation mid-way through the sequence of requests if the "clock" timeout was exceeded.
You're correct, I do not set any explicit timeouts when calling the code, so any non-null timeout propagated to the `TimeoutGuard` must be coming from a library default.
However, that timeout is probably meant as a timeout for each requests request, and not the total timeout.
As a quick check (and for debugging purposes only), I can verify that setting the timeout passed to `TimeoutGuard` to `None`, while still passing the read and connect timeouts to `requests` via the `timeout` parameter, does indeed stop this issue from happening in the `AuthorizedSession.request` function and allows me to successfully complete a large/slow upload operation.
So it seems that, perhaps, specifying a separate timeout to pass to `TimeoutGuard`, in addition to the one we want to pass to the `requests` library, would remediate this problem, although that might be a naive fix since I have only looked at a very limited portion of the codebase, and it would be one more method argument to worry about.
I would be happy to run any tests that would be helpful since I have access to big files and a slow network.
@mcsimps2 FYI, I opened a pull request in the `google-auth` library.
The `timeout` parameter no longer affects the `TimeoutGuard`; instead, a separate `max_allowed_time` parameter is added to control it. Its value is `None` by default, making the `TimeoutGuard` a no-op by default as well.
If you wish, you can already try it out by using the `google-auth` version from the PR branch, but otherwise there will probably be a new release soon anyway, and the Storage lib will just need a version bump of its `google-auth` dependency.
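The separated semantics described above can be sketched as follows. The parameter names mirror the PR's `timeout`/`max_allowed_time` split, but this is a self-contained illustration with a stand-in chunked-upload loop, not the real google-auth code: `timeout` bounds each individual transport request, while `max_allowed_time` (defaulting to `None`, i.e. unbounded) caps the whole multi-request operation.

```python
import time

def upload_with_budget(send_chunk, timeout=(60, 61), max_allowed_time=None):
    """Illustration of the separated timeout semantics.

    `timeout` would be handed to each transport request (connect, read),
    while `max_allowed_time` bounds the whole multi-request operation.
    """
    deadline = None
    if max_allowed_time is not None:
        deadline = time.monotonic() + max_allowed_time
    while True:
        # One chunk == one HTTP request; `timeout` applies per request.
        done = send_chunk(timeout=timeout)
        if done:
            return True
        # The wall-clock budget is checked between requests, so the
        # default of None never interrupts a long upload.
        if deadline is not None and time.monotonic() > deadline:
            raise TimeoutError("max_allowed_time exceeded")

# A three-chunk upload completes fine with no total budget...
chunks = iter([False, False, True])
assert upload_with_budget(lambda timeout: next(chunks))

# ...but an explicit total budget interrupts a slow multi-chunk upload.
def slow_chunk(timeout):
    time.sleep(0.02)
    return False

try:
    upload_with_budget(slow_chunk, max_allowed_time=0.01)
except TimeoutError as exc:
    print("aborted:", exc)
```

With this shape, a large upload on a slow connection only fails if the caller explicitly opts into a total budget, which matches the behavior the PR restores.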
Awesome, thank you for pushing this forward so quickly - much appreciated! This will help a lot; I'm looking forward to the next release.
@mcsimps2 The most recent google-auth release, `google-auth==1.11.0`, contains the change that separates transport timeouts from the total wall-clock timeout. The latter is `None` by default, meaning that `TimeoutGuard` should not be raising unnecessary Timeout errors anymore.
Until a new `storage` lib release is made, you could manually pin google-auth to the above-mentioned version to resolve this issue.
Also, thanks for the very good and detailed report, it made diagnosing the issue much faster. 👍
@plamut is all of the work done for this (minus a release)?
@crwilcox Yes, the bug no longer occurs if `google-auth>=1.11.0` is used (the bugfix was made there a while ago).