Comments (8)
Well... it looks like our account is severely throttled. This script reproduces the problem:
https://gist.github.com/kapadia/a414dca221c6c976d4c1
yielding the same error from our download stack:
Traceback (most recent call last):
  File "/usr/local/bin/usgs", line 9, in <module>
    load_entry_point('usgs==0.1.5', 'console_scripts', 'usgs')()
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 664, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 644, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 991, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 837, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 464, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/usgs/scripts/cli.py", line 198, in download_url
    data = api.download(dataset, node, scene_ids, product)
  File "/usr/local/lib/python2.7/dist-packages/usgs/api.py", line 139, in download
    _check_for_usgs_error(root)
  File "/usr/local/lib/python2.7/dist-packages/usgs/api.py", line 44, in _check_for_usgs_error
    raise USGSError(fault_string)
usgs.USGSError: User currently has more than 10 downloads that have not been attempted in the past 10 minutes.
There are still a few scenes that squeeze through this constraint.
Since the time limit threshold is stated as 10 minutes, I've temporarily paused the stack as an effort to clear this constraint. We'll likely still have to refactor the landsat-ingestor to request fewer download urls, or act on them more rapidly.
from landsat_ingestor.
heyo @kapadia, thanks for the info
There are still a few scenes that squeeze through this constraint.
to clarify, does this mean that if > 10 downloads are attempted within the 10 minute limit, the majority fail? ie only a few sneak through
I've temporarily paused the stack as an effort to clear this constraint.
sounds like this might exacerbate the issue if there will then be a greater backlog of imagery to download, no? trying to get a grasp on how the queue is working here (poking at the puller utils).
from landsat_ingestor.
Hiya @camillacaros -
I'm still trying to understand the exact meaning of the error posted above. My interpretation, based purely on the message, is that if 10 download urls have been requested AND not attempted, then USGS will return an error.
In its current state, the landsat-ingestor uses parallel to request 10 scenes at a time. Because we still experience a high number of 503s from USGS servers, we likely surpass the limit of untried downloads by a large margin.
Temporarily pausing the stack is an effort to clear out the 10 untried download constraint that USGS is now tracking.
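To make that interpretation concrete, here is a toy model of the constraint as I read it from the error message. It is only a sketch: the class name, the method names, and the assumption that the eleventh untried url is the one rejected are all hypothetical, and none of this is the actual USGS logic. It also assumes untried urls stop counting once they are older than the 10 minute window, which is the reason pausing the stack might help.

```python
class OutstandingUrlModel:
    """Toy model of the 'untried download urls' limit (hypothetical, not USGS code)."""

    def __init__(self, limit=10, window_seconds=600):
        self.limit = limit
        self.window_seconds = window_seconds
        # scene_id -> time (in seconds) at which its download url was requested
        self._requested_at = {}

    def _prune(self, now):
        # Assumption: urls older than the window no longer count against the limit.
        cutoff = now - self.window_seconds
        self._requested_at = {
            scene_id: t for scene_id, t in self._requested_at.items() if t > cutoff
        }

    def outstanding(self, now):
        """Number of urls requested within the window and not yet attempted."""
        self._prune(now)
        return len(self._requested_at)

    def request_url(self, scene_id, now):
        """Record a url request, or raise if too many urls are still untried."""
        if self.outstanding(now) >= self.limit:
            raise RuntimeError("too many download urls that have not been attempted")
        self._requested_at[scene_id] = now

    def attempt_download(self, scene_id):
        """Mark a url as attempted so it stops counting against the limit."""
        self._requested_at.pop(scene_id, None)
```

Under this model, a worker that requests a url and gets a 503 before the download is registered as attempted keeps that url on the books, which would explain how we blow past the limit so quickly.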
from landsat_ingestor.
@kapadia You might find that turning parallel down to 5 or 6 mostly clears the "more than 10 outstanding download urls" error.
from landsat_ingestor.
@warmerdam Yes, that's one change to be made. However, (I think) the 503s that we continue to encounter contribute to the limit. This would result in accumulating many download urls that have not been marked as accessed.
from landsat_ingestor.
To better understand the rate limiting, I hit USGS servers in various ways. The gist posted above helps confirm that a user cannot have more than 10 download urls that have not been accessed (e.g. you can't just hoard urls, you gotta use them). We crossed that line pretty quickly by running 10 concurrent downloads, while also continuing to get 503s.
After testing a few levels of concurrency, it seems best to stick with 2-3 concurrent downloads. This will have an impact on the job time, though the extent is not yet clear since the current job is still working through the backlog.
The stack is running again, and does not have any signs of the rate limiting error or 503s.
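For anyone who wants the gist of the approach without reading the ingestor, here is a minimal sketch of bounded concurrency in Python 3. It is not the landsat-ingestor code (which drives this through parallel): `ingest_scenes`, `get_download_url` and `fetch` are hypothetical names, and the two callables stand in for whatever requests a url from USGS and downloads it. The point is that each worker uses its url right after requesting it, so the number of untried urls never exceeds `max_workers`.

```python
from concurrent.futures import ThreadPoolExecutor


def ingest_scenes(scene_ids, get_download_url, fetch, max_workers=3):
    """Download scenes with at most `max_workers` urls outstanding at once.

    `get_download_url(scene_id)` and `fetch(url)` are caller-supplied
    placeholders for the real USGS request and download steps.
    """

    def worker(scene_id):
        # Request the url only when this worker is ready to use it,
        # then attempt the download immediately (no hoarding of urls).
        url = get_download_url(scene_id)
        return scene_id, fetch(url)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and re-raises worker exceptions.
        return dict(pool.map(worker, scene_ids))
```

Called as `ingest_scenes(ids, get_download_url, fetch, max_workers=3)`, this matches the 2-3 concurrent downloads mentioned above. Retrying on 503s is deliberately left out of the sketch.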
from landsat_ingestor.
(e.g. you can't just hoard urls, you gotta use them). We crossed that line pretty quickly by running 10 concurrent downloads, while also continuing to get 503s
ah interesting. thanks for the dig & fix @kapadia !
from landsat_ingestor.
np C-dawg.
from landsat_ingestor.