
cdsapi's Introduction

Install

Install via pip with:

$ pip install cdsapi

Configure

Get your user ID (UID) and API key from the CDS portal at https://cds.climate.copernicus.eu/user and write them into the configuration file, so it looks like:

$ cat ~/.cdsapirc
url: https://cds.climate.copernicus.eu/api/v2
key: <UID>:<API key>

Remember to agree to the Terms and Conditions of every dataset that you intend to download.

Test

Perform a small test retrieval of ERA5 data:

$ python
>>> import cdsapi
>>> cds = cdsapi.Client()
>>> cds.retrieve('reanalysis-era5-pressure-levels', {
           "variable": "temperature",
           "pressure_level": "1000",
           "product_type": "reanalysis",
           "date": "2017-12-01/2017-12-31",
           "time": "12:00",
           "format": "grib"
       }, 'download.grib')
>>>

License

Copyright 2018 - 2019 European Centre for Medium-Range Weather Forecasts (ECMWF)

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

In applying this licence, ECMWF does not waive the privileges and immunities granted to it by virtue of its status as an intergovernmental organisation nor does it submit to any jurisdiction.

cdsapi's People

Contributors

alexamici, b8raoult, dtip, eddycmwf, francesconazzaro, fxi, gbiavati, jblarsen, kandersolar, keul, lukejones123, malmans2, markelg, matrss, stephansiemen, tgrandje


cdsapi's Issues

'Overload' object has no attribute 'timeout'

I have been using the cdsapi for a long time now, but since today I get the following error message.

2022-02-24 16:02:11,614 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels
2022-02-24 16:02:11,646 INFO Request is queued
2022-02-24 16:02:12,675 INFO Request is failed
2022-02-24 16:02:12,675 ERROR Message: an internal error occurred processing your request
2022-02-24 16:02:12,690 ERROR Reason:  'Overload' object has no attribute 'timeout'
2022-02-24 16:02:12,690 ERROR   Traceback (most recent call last):
2022-02-24 16:02:12,690 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 51, in handle_request
2022-02-24 16:02:12,690 ERROR       timeout = proc.timeout
2022-02-24 16:02:12,690 ERROR   AttributeError: 'Overload' object has no attribute 'timeout'

I am using cdsapi as follows (with many consecutive / parallel calls):

API_URL = "https://cds.climate.copernicus.eu/api/v2"

c = cdsapi.Client(key=api_key, url=API_URL)
dataset_name = 'reanalysis-era5-single-levels'
download_date = datetime.datetime(1979,1,7,7,0)
cds_params =  {'product_type': 'reanalysis', 'format': 'netcdf', 'variable': 'total_precipitation', 'year': '1979', 'month': '01', 'day': '07', 'time': ['07:00']}
download_path = "{some-path-on-disc}.nc"
try:
    c.retrieve(dataset_name, cds_params, download_path)
    return True

except Exception as err:
    if any(
        [
            str(err).startswith("no data"),
            "the request you have submitted is not valid. There is no data matching your request." in str(err),
        ]
    ):
        logger.info(f"No data found for {download_date}")
        return False
    else:
        raise err

Is this a bug in the API, or am I doing something wrong?

Custom FailedRequest and NoDataMatchingRequest exceptions

For the evaluation of CAMS2-40 model results within the CAMS2-83 project, we download 22 different model results every day.

When requesting "cams-europe-air-quality-forecasts" data that is missing, or has not yet been delivered cdsapi raises an Exception with the following message:

the request you have submitted is not valid. There is no data matching your request. Check that you have specified the correct fields and values..

Currently, our application that handles the downloading needs to check the exception message
in order to find out if the data is not yet available or if there is a different kind of error.

    try:
        client.retrieve("cams-europe-air-quality-forecasts", request, tmp_path)
    except Exception as e:
        if "There is no data matching your request" in str(e):
            logger.info(f"No {model} data to be found")
            return
        logger.error(e)
        raise FailedToDownload(f"failed to download {model}") from e

As far as I can tell, the Exception we're trying to handle comes from lines 506 to 509 of api.py:

cdsapi/cdsapi/api.py

Lines 494 to 509 in f3b94a9

if reply["state"] in ("failed",):
    self.error("Message: %s", reply["error"].get("message"))
    self.error("Reason: %s", reply["error"].get("reason"))
    for n in (
        reply.get("error", {})
        .get("context", {})
        .get("traceback", "")
        .split("\n")
    ):
        if n.strip() == "" and not self.full_stack:
            break
        self.error("  %s", n)
    raise Exception(
        "%s. %s."
        % (reply["error"].get("message"), reply["error"].get("reason"))
    )

It would be of great help if, instead of raising a plain Exception, cdsapi raised a custom exception, say FailedRequest. That way we could limit the scope of the except block to handle only FailedRequest exceptions.

Even better would be if cdsapi raised a custom NoDataMatchingRequest exception when reply["error"].get("reason", "").startswith("There is no data matching your request") or similar.
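
Until such exceptions exist upstream, the message matching can at least be centralised in one helper on the caller's side. A minimal sketch; the class and function names here are hypothetical, not part of cdsapi:

```python
class FailedRequest(Exception):
    """Raised when the CDS reports a request as failed (hypothetical)."""

class NoDataMatchingRequest(FailedRequest):
    """Raised when the failure reason is 'no data matching your request' (hypothetical)."""

def classify_cds_error(exc: Exception) -> FailedRequest:
    """Map a plain Exception raised by cdsapi onto a typed exception
    by inspecting its message (the only signal available today)."""
    if "There is no data matching your request" in str(exc):
        return NoDataMatchingRequest(str(exc))
    return FailedRequest(str(exc))

# Usage sketch, assuming a configured cdsapi.Client named `client`:
# try:
#     client.retrieve("cams-europe-air-quality-forecasts", request, tmp_path)
# except Exception as e:
#     raise classify_cds_error(e) from e
```

This keeps the fragile string matching in exactly one place, so a future upstream exception hierarchy only requires deleting the helper.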

Query for data availability and update date

Is your feature request related to a problem? Please describe.

Some datasets are not updated on a strict schedule (specifically the "reanalysis-era5-single-levels-monthly-means"). I see no means of finding out what, if anything, is newer than the last time I checked, which means I have to re-request the data regularly until it changes (and detecting that there has been a change can be tricky and convoluted).

Describe the solution you'd like

It would be extremely useful if there were some programmatic way to check the "last modified" time of a dataset. Even a single modification time for the dataset as a whole would reduce the amount of bandwidth consumed.

Ideally, it would be nice if there were a way to query the "last modification time" as a time series matched with the record dimension of the dataset, so that we can find out what ranges of time are newer than what we currently have. It would also help with automatically going back and reprocessing revised analyses.
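
cdsapi offers no such query today, but if the service exposed a standard HTTP Last-Modified header on its dataset endpoints, the client-side comparison could be sketched like this (the helper name and the assumption that the server sends the header are both hypothetical):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def newer_than(last_modified_header: str, cached_iso: str) -> bool:
    """True if the server's Last-Modified header is newer than a cached
    ISO-format timestamp recorded at the previous download."""
    remote = parsedate_to_datetime(last_modified_header)
    local = datetime.fromisoformat(cached_iso)
    if local.tzinfo is None:
        local = local.replace(tzinfo=timezone.utc)  # assume UTC for naive stamps
    return remote > local
```

One could then skip the re-request entirely whenever the helper returns False.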

Describe alternatives you've considered

No response

Additional context

No response

Organisation

Atmospheric and Environmental Research, Inc.

metadata information from cdsapi ?

Hi and thanks for the very useful API,

As others have pointed out, more fine-grained documentation is missing about accepted parameters and available data. Of course, the online Data documentation provides much of that missing information, but it does not match the cdsapi one to one, because variables and other names (model, scenario, ...) are spelled differently in their full-text and API forms, so one has to go through the full interface to get the appropriate CDS API command. That works, but it could be further simplified in my opinion.

I was wondering if that could be built as a specific request for each dataset, for instance cdsapi.getchoices(dataset, field, **kwargs), such as:

cdsapi.getchoices('projections-cmip5-monthly-single-levels', 'model')
cdsapi.getchoices('projections-cmip5-monthly-single-levels', 'model', variable='2m_temperature')
cdsapi.getchoices('projections-cmip5-monthly-single-levels', 'model', variable='2m_temperature', scenario='rcp_8_5')

These requests target the model field in the projections-cmip5-monthly-single-levels dataset. More specifically, the three lines would return:

  • all models in that dataset
  • all models in that dataset with the variable 2m_temperature
  • all models in that dataset with the variable 2m_temperature and the scenario rcp_8_5

Similarly, one could request available scenarios and variables:

cdsapi.getchoices('projections-cmip5-monthly-single-levels', 'variable')
cdsapi.getchoices('projections-cmip5-monthly-single-levels', 'scenario')

(one could also have cdsapi.getfields(dataset, required=False) for a list of all fields associated with one dataset, or optionally a list of all required fields)

That would clearly come in handy in offline scripts.
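
To illustrate the proposed interface, here is a minimal sketch of how getchoices could behave, assuming the CDS published per-dataset metadata as a list of valid field combinations (the records structure is invented purely for illustration; nothing like it exists in cdsapi today):

```python
def getchoices(records, field, **constraints):
    """Hypothetical helper: list the values of `field` among catalogue
    records that satisfy every keyword constraint.

    `records` stands in for per-dataset metadata the CDS would need to
    publish, e.g. one dict per valid (model, variable, scenario) combination.
    """
    matches = (r for r in records
               if all(r.get(k) == v for k, v in constraints.items()))
    return sorted({r[field] for r in matches if field in r})

# Example catalogue fragment (invented values):
records = [
    {"model": "model_a", "variable": "2m_temperature", "scenario": "rcp_8_5"},
    {"model": "model_b", "variable": "2m_temperature", "scenario": "rcp_4_5"},
    {"model": "model_c", "variable": "precipitation", "scenario": "rcp_8_5"},
]
```

With that data in hand, getchoices(records, 'model', variable='2m_temperature') narrows the model list exactly as the three example lines describe.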

ImportError: No module named cdsapi

Greetings, wizards!
I'm using the code below to download ERA5 data:

===========================================
#!/usr/bin/env python
import cdsapi

c = cdsapi.Client()

c.retrieve(
    'reanalysis-era5-single-levels',
    {
        'product_type': 'reanalysis',
        'format': 'grib',
        'variable': [
            '10m_u_component_of_wind', '10m_v_component_of_wind', '2m_dewpoint_temperature',
            '2m_temperature', 'land_sea_mask', 'mean_sea_level_pressure',
            'sea_ice_cover', 'sea_surface_temperature', 'skin_temperature',
            'snow_depth', 'soil_temperature_level_1', 'soil_temperature_level_2',
            'soil_temperature_level_3', 'soil_temperature_level_4', 'surface_pressure',
            'volumetric_soil_water_layer_1', 'volumetric_soil_water_layer_2', 'volumetric_soil_water_layer_3',
            'volumetric_soil_water_layer_4',
        ],
        'date': '20170101/20170331',
        'area': '60/80/10/150',
        'time': [
            '00:00', '01:00', '02:00',
            '03:00', '04:00', '05:00',
            '06:00', '07:00', '08:00',
            '09:00', '10:00', '11:00',
            '12:00',
        ],
    },
    'ERA5-20170101-20170331-sl.grib')

but there is an error saying:

File "GetERA5-20170101-20170331-sl.py", line 2, in
import cdsapi
ImportError: No module named cdsapi

I've installed cdsapi via "pip install cdsapi" and it installed correctly. Any idea what my problem is?
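
This error usually means pip installed cdsapi into a different interpreter than the python running the script. A quick standard-library check of which interpreter is running and what it can see:

```python
# Check which interpreter is executing and whether it can import cdsapi.
import importlib.util
import sys

print(sys.executable)  # the Python binary actually running this script

spec = importlib.util.find_spec("cdsapi")
print(spec.origin if spec else "cdsapi is not installed for this interpreter")
```

If the module is missing, installing with `python -m pip install cdsapi` (using the same `python` that runs the script) guarantees pip and the interpreter match.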

download function error:'module' object is not callable

Hi, I would appreciate your help.

My environment: Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 11:57:41) [MSC v.1900 64 bit (AMD64)] on win32
The code is as follows:

import cdsapi

c = cdsapi.Client()

r = c.retrieve(
    "reanalysis-era5-pressure-levels",
    {
        "variable": "temperature",
        "pressure_level": "1000",
        "product_type": "reanalysis",
        "year": "2008",
        "month": "01",
        "day": "01",
        "time": "12:00",
        "format": "grib",
    },
)
r.download("download.grib")

The request is retrieved successfully and shown at https://cds.climate.copernicus.eu/cdsapp#!/yourrequests

However, when I try to download the result by using:

r.download("download.grib")

It shows the following error:
r.download("download.grib")
2020-11-26 15:00:14,179 INFO Downloading http://136.156.132.201/cache-compute-0004/cache/data3/adaptor.mars.internal-1606349148.740683-9340-19-64a26b8e-f9c7-4a4d-995f-e58359f0fb19.grib to download.grib (2M)
2020-11-26 15:00:14,184 DEBUG Starting new HTTP connection (1): 136.156.132.201
2020-11-26 15:00:14,816 DEBUG http://136.156.132.201:80 "GET /cache-compute-0004/cache/data3/adaptor.mars.internal-1606349148.740683-9340-19-64a26b8e-f9c7-4a4d-995f-e58359f0fb19.grib HTTP/1.1" 200 2076600
Traceback (most recent call last):

File "", line 1, in
r.download("download.grib")

File "D:\Anaconda3\lib\site-packages\cdsapi\api.py", line 167, in download
target)

File "D:\Anaconda3\lib\site-packages\cdsapi\api.py", line 125, in _download
leave=False,

TypeError: 'module' object is not callable

What is the problem and how can I fix it? Thanks.

(short) loss of connection corrupts data

When trying to download fapar LAI data from the CDS using this package, I encountered a (short) loss of internet connection.

This caused the following error:

2023-02-28 11:50:27,112 ERROR Download incomplete, downloaded 1337900919 byte(s) out of 15881569751
2023-02-28 11:50:27,113 WARNING Sleeping 10 seconds
2023-02-28 11:50:37,140 WARNING Resuming download at byte 1337900919

However, the resulting .zip file contains one netCDF file with a size of 1 KB, while the other files are all ~400-500 MB.
This corrupted netCDF file cannot be opened.

It would be better if the request failed completely, rather than corrupting an (unknown) number of files, which only becomes visible after extracting the .zip file and looking at the file sizes.

Code of my request
    cds_client.retrieve(
        "satellite-lai-fapar",
        {
            "format": "zip",
            "variable": "lai",
            "satellite": [
                "proba", "spot",
            ],
            "sensor": "vgt",
            "horizontal_resolution": "1km",
            "product_version": "V3",
            "year": f"{year}",
            "month": [
                "01", "02", "03",
                "04", "05", "06",
                "07", "08", "09",
                "10", "11", "12",
            ],
            "nominal_day": [
                "10", "20", "28",
                "30", "31",
            ],
        },
        f"LAI_fapar_vgt_{year}.zip"
    )
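
Until the client fails hard on incomplete downloads, one defensive workaround is to validate the archive after retrieve returns. A sketch; the function name and the 1 KB threshold are assumptions to tune per dataset:

```python
import zipfile

def verify_zip(path, min_member_bytes=1024):
    """Reject archives that are corrupt or contain suspiciously small members.

    Returns False if any member fails its CRC check or is smaller than
    `min_member_bytes` (1 KB here is an arbitrary floor; real LAI files
    in this dataset are hundreds of MB).
    """
    with zipfile.ZipFile(path) as zf:
        if zf.testzip() is not None:  # first member with a bad CRC, if any
            return False
        return all(info.file_size >= min_member_bytes for info in zf.infolist())
```

Calling verify_zip on the downloaded file, then deleting and re-requesting when it returns False, turns the silent corruption into a detectable retry.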

An internal error occurred processing your request Reason: cannot import name 'make_traceback'

The code I'm using

import cdsapi
c = cdsapi.Client()
c.retrieve('cems-glofas-historical',{'format': 'zip','variable': 'Upstream area',},'download.zip')

Traceback

020-11-02 07:13:18,908 INFO Welcome to the CDS
2020-11-02 07:13:18,908 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/cems-glofas-historical
2020-11-02 07:13:19,726 INFO Request is queued
2020-11-02 07:13:20,914 INFO Request is failed
2020-11-02 07:13:20,914 ERROR Message: an internal error occurred processing your request
2020-11-02 07:13:20,914 ERROR Reason:  cannot import name 'make_traceback'
2020-11-02 07:13:20,914 ERROR   Traceback (most recent call last):
2020-11-02 07:13:20,914 ERROR     File "/opt/cds/adaptor/cdshandlers/adaptorlib/adaptorrequesthandler.py", line 66, in handle_request
2020-11-02 07:13:20,914 ERROR       return super().handle_request(cdsinf, data_request, self.config)
2020-11-02 07:13:20,915 ERROR     File "/opt/cds/cdsinf/python/lib/cdsinf/runner/requesthandler.py", line 113, in handle_request
2020-11-02 07:13:20,915 ERROR       return handler(cdsinf, request, config)
2020-11-02 07:13:20,915 ERROR     File "/opt/cds/adaptor/cdshandlers/url/handler.py", line 39, in handle_retrieve
2020-11-02 07:13:20,915 ERROR       dis = self._fetch_list(data_request["specific"])
2020-11-02 07:13:20,915 ERROR     File "/opt/cds/adaptor/cdshandlers/url/handler.py", line 91, in _fetch_list
2020-11-02 07:13:20,915 ERROR       pattern, valid_request, max_errors, patternmatchon, filename)
2020-11-02 07:13:20,915 ERROR     File "/opt/cds/adaptor/cdshandlers/adaptorlib/tools.py", line 141, in substitute
2020-11-02 07:13:20,916 ERROR       processed = template.render(r)
2020-11-02 07:13:20,916 ERROR     File "/usr/local/lib/python3.6/site-packages/jinja2/asyncsupport.py", line 76, in render
2020-11-02 07:13:20,916 ERROR     File "/usr/local/lib/python3.6/site-packages/jinja2/environment.py", line 1008, in render
2020-11-02 07:13:20,916 ERROR       block_start_string,
2020-11-02 07:13:20,916 ERROR     File "/usr/local/lib/python3.6/site-packages/jinja2/environment.py", line 773, in handle_exception
2020-11-02 07:13:20,916 ERROR       except TemplateSyntaxError as e:
2020-11-02 07:13:20,916 ERROR   ImportError: cannot import name 'make_traceback'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.7/site-packages/cdsapi/api.py", line 317, in retrieve
    result = self._api('%s/resources/%s' % (self.url, name), request, 'POST')
  File "/opt/conda/lib/python3.7/site-packages/cdsapi/api.py", line 458, in _api
    raise Exception("%s. %s." % (reply['error'].get('message'), reply['error'].get('reason')))

Best practices when sending multiple requests

As it stands now, I usually make several requests, splitting the datasets I fetch through the API into chunks for less memory-intensive processing.

However, there is no way to send a list of requests (I am just using for loops to do so now), and my requests usually spend ~20 min in the queue and 1-2 min downloading, which makes fetching the datasets quite time-consuming.

Is there any way to speed this up?
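
The client has no batch API, but because each retrieve() call blocks while its request waits in the server-side queue, submitting the chunks from a small thread pool lets them queue concurrently instead of one after another. A sketch; the request-splitting helper, dataset name, and worker count are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

def monthly_requests(year, base):
    """Split one large request into twelve per-month request dicts (pure helper)."""
    return [dict(base, year=str(year), month=f"{m:02d}") for m in range(1, 13)]

# Hypothetical usage, assuming a configured cdsapi.Client named `client`:
# with ThreadPoolExecutor(max_workers=4) as pool:
#     for req in monthly_requests(2020, {"product_type": "reanalysis",
#                                        "variable": "total_precipitation",
#                                        "format": "netcdf"}):
#         pool.submit(client.retrieve, "reanalysis-era5-single-levels",
#                     req, f"era5_{req['year']}_{req['month']}.nc")
```

Keeping the worker count modest avoids tripping per-user request limits on the CDS side.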

Exception: Missing/incomplete configuration

Exception Traceback (most recent call last)
in
1 import cdsapi
2
----> 3 c = cdsapi.Client()
4
5 c.retrieve(

~\anaconda3\lib\site-packages\cdsapi\api.py in init(self, url, key, quiet, debug, verify, timeout, progress, full_stack, delete, retry_max, sleep_max, wait_until_complete, info_callback, warning_callback, error_callback, debug_callback, metadata, forget, session)
299
300 if url is None or key is None or key is None:
--> 301 raise Exception("Missing/incomplete configuration file: %s" % (dotrc))
302
303 self.url = url

Exception: Missing/incomplete configuration file: C:\Users\HP/.cdsapirc
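
The client looks for its credentials in a .cdsapirc file under the home directory unless url and key are passed explicitly. A quick standard-library check of the path expected on your machine, plus the explicit-credentials alternative (mirroring the constructor arguments shown in other reports above; the url/key values are placeholders):

```python
import os

# cdsapi reads ~/.cdsapirc by default; on Windows "~" expands under your
# user profile (note: .cdsapirc is a file, not a folder).
rc_path = os.path.join(os.path.expanduser("~"), ".cdsapirc")
print(rc_path, "exists:", os.path.isfile(rc_path))

# Alternatively, bypass the file entirely and pass the credentials in code:
# import cdsapi
# client = cdsapi.Client(url="https://cds.climate.copernicus.eu/api/v2",
#                        key="<UID>:<API key>")
```

If the printed path is missing, creating that exact file with the url/key lines from the Configure section above resolves the exception.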

Conda-forge package is out of date

Would it be possible to update the conda-forge cdsapi package? Currently it is at version 0.2.7, while the pip version is 0.3.1.

Thanks in advance!

Sending multiple requests

Hello all,
This is a question rather than an issue: is there a way to send requests into the queue (i.e. as from the website) from the Python API?
Thank you!

Key error with wait_until_complete

Hello,

When I use wait_until_complete=False, I get the following error:

2022-02-02 12:29:22,550 INFO Welcome to the CDS
2022-02-02 12:29:22,551 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-land
Traceback (most recent call last):
  File "test.py", line 5, in <module>
    c.retrieve(
  File "users\Python310\lib\site-packages\cdsapi\api.py", line 350, in retrieve
    result.download(target)
  File "users\Python310\lib\site-packages\cdsapi\api.py", line 173, in download
    return self._download(self.location, self.content_length, target)
  File "users\Python310\lib\site-packages\cdsapi\api.py", line 181, in location
    return urljoin(self._url, self.reply["location"])
KeyError: 'location'

Does anyone know how to solve this issue?

CMIP6 Unable to parse the time values entered

I want to download a subset of CMIP6 data between two dates using the cdsapi, but didn't have success.

I tried to download with this configuration:

c.retrieve(
    'projections-cmip6',
    {
        'temporal_resolution': 'daily',
        'experiment': 'ssp5_8_5',
        'level': 'single_levels',
        'variable': 'precipitation',
        'model': 'cmcc_esm2',
        'date': '2020-01-01/2050-01-01',
        'format': 'zip',
    },
    'download.zip')

When I run the code above, I get this error:

2021-11-30 15:06:50,278 INFO Welcome to the CDS
2021-11-30 15:06:50,278 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/projections-cmip6
2021-11-30 15:06:50,340 INFO Request is queued
2021-11-30 15:06:51,397 INFO Request is running
2021-11-30 15:07:03,855 INFO Request is failed
2021-11-30 15:07:03,855 ERROR Message: an internal error occurred processing your request
2021-11-30 15:07:03,855 ERROR Reason:  Process error: Unable to parse the time values entered
2021-11-30 15:07:03,856 ERROR   Traceback (most recent call last):
2021-11-30 15:07:03,856 ERROR     File "/usr/local/lib/python3.6/site-packages/rooki/results.py", line 33, in url
2021-11-30 15:07:03,856 ERROR       return self.response.get()[0]
2021-11-30 15:07:03,856 ERROR     File "/usr/local/lib/python3.6/site-packages/birdy/client/outputs.py", line 30, in get
2021-11-30 15:07:03,856 ERROR       raise ProcessFailed("Sorry, process failed.")
2021-11-30 15:07:03,856 ERROR   birdy.exceptions.ProcessFailed: Sorry, process failed.
Traceback (most recent call last):
  File "/PROJECTES/PUBLICDATA/adaptation/ejemplo.py", line 83, in <module>
    retrieve_data(target, model, period, date=requestDates)
  File "/PROJECTES/PUBLICDATA/adaptation/ejemplo.py", line 34, in retrieve_data
    c.retrieve(
  File "/home/isglobal.lan/rmendez/.conda/envs/tools/lib/python3.9/site-packages/cdsapi/api.py", line 348, in retrieve
    result = self._api("%s/resources/%s" % (self.url, name), request, "POST")
  File "/home/isglobal.lan/rmendez/.conda/envs/tools/lib/python3.9/site-packages/cdsapi/api.py", line 506, in _api
    raise Exception(
Exception: an internal error occurred processing your request. Process error: Unable to parse the time values entered.

I also tried to use the toolbox request and got this error:

Traceback (most recent call last):
  File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 55, in handle_request
    result = cached(context.method, proc, context, context.args, context.kwargs)
  File "/opt/cdstoolbox/cdscompute/cdscompute/caching.py", line 108, in cached
    result = proc(context, *context.args, **context.kwargs)
  File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 118, in __call__
    return p(*args, **kwargs)
  File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 59, in __call__
    return self.proc(context, *args, **kwargs)
  File "/home/cds/cdsservices/services/retrieve.py", line 198, in execute
    remote = context.call_resource(name, request, update_specific_metadata={'app_scope': 'adaptor'})
  File "/opt/cdstoolbox/cdscompute/cdscompute/context.py", line 307, in call_resource
    return c.call_resource(service, *args, **kwargs).value
  File "/opt/cdstoolbox/cdsworkflows/cdsworkflows/future.py", line 76, in value
    raise self._result
cdsworkflows.error.ClientError: {'traceback': 'Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/rooki/results.py", line 33, in url
    return self.response.get()[0]
  File "/usr/local/lib/python3.6/site-packages/birdy/client/outputs.py", line 30, in get
    raise ProcessFailed("Sorry, process failed.")
birdy.exceptions.ProcessFailed: Sorry, process failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/cds/cdsservices/services/esgf_wps/requests.py", line 60, in process
    results = response.download_urls()
  File "/usr/local/lib/python3.6/site-packages/rooki/results.py", line 78, in download_urls
    return [url.text for url in self.doc.find_all("metaurl")]
  File "/usr/local/lib/python3.6/site-packages/rooki/results.py", line 47, in doc
    self._doc = BeautifulSoup(self.xml, "xml")
  File "/usr/local/lib/python3.6/site-packages/rooki/results.py", line 41, in xml
    raise Exception(f"Could not download metalink document. {e}")
Exception: Could not download metalink document. Sorry, process failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/cds/cdsinf/python/lib/cdsinf/runner/dispatcher.py", line 616, in handle_request
    context.get("method_config", None))
  File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 55, in handle_request
    result = cached(context.method, proc, context, context.args, context.kwargs)
  File "/opt/cdstoolbox/cdscompute/cdscompute/caching.py", line 108, in cached
    result = proc(context, *context.args, **context.kwargs)
  File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 118, in __call__
    return p(*args, **kwargs)
  File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 59, in __call__
    return self.proc(context, *args, **kwargs)
  File "/home/cds/cdsservices/services/esgf_wps/__init__.py", line 41, in execute
    result = requests.process(context, request_facets, request, **process_kwargs)
  File "/home/cds/cdsservices/services/esgf_wps/requests.py", line 64, in process
    raise Exception(message)
Exception: Process error: Unable to parse the time values entered
'}

If I don't specify the date parameter (i.e. request the whole available temporal range), the API works and the file is downloaded.

Many thanks

Exception hierarchy

Is your feature request related to a problem? Please describe.

Inside cdsapi, all exceptions are plain instances of Exception, which makes exception handling tricky for users of the library: they are forced to match on the exception message.

Describe the solution you'd like

Instead, an exception hierarchy could be added to the project. Last week I was working with cdsapi for a project, and this week I could contribute PRs to the main repo, if it is active, to incorporate some improvements, like the exception hierarchy:

e5a88de
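
A minimal sketch of what such a hierarchy might look like, together with a helper of the kind that could replace the plain `raise Exception(...)` in api.py. All names here are proposals, not existing cdsapi API:

```python
class CdsApiError(Exception):
    """Base class for all errors raised by the client (proposed)."""

class FailedRequestError(CdsApiError):
    """The CDS reported the request as failed (proposed)."""
    def __init__(self, message, reason=""):
        super().__init__(f"{message}. {reason}.")
        self.reason = reason

class NoDataError(FailedRequestError):
    """The request was valid but matched no data (proposed)."""

def raise_for_reply(reply):
    """Turn a failed reply dict into a typed exception, keyed on the
    machine-readable `reason` field rather than the full message."""
    error = reply.get("error", {})
    message = error.get("message", "")
    reason = error.get("reason", "")
    if reason.startswith("There is no data matching your request"):
        raise NoDataError(message, reason)
    raise FailedRequestError(message, reason)
```

Callers could then write `except NoDataError:` for the missing-data case and `except CdsApiError:` as a catch-all, with no string matching anywhere.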

Describe alternatives you've considered

No response

Additional context

No response

Organisation

No response

ERA5 river discharge `zip` or `tar.gz` download are invalid

Attempting to extract the file downloaded by the script below, I get "An error occurred while loading the archive".

import cdsapi

c = cdsapi.Client()

c.retrieve(
    'cems-glofas-historical',
    {
        'format':'zip',
        'variable':'River discharge',
        'dataset':'Consolidated reanalysis',
        'version':'2.1',
        'year':'2019',
        'month':'01',
        'day':'29'
    },
    'download.zip')

ERA5 reanalysis data retrieval is unexpectedly slow using CDS API (takes 45 minutes to download 2.1MB data)

Basically, the size of the data downloaded via the CDS API is 2.1 MB, but it takes 45 minutes. This is unusual, because I can download more than 2 GB via ecmwfapi in that amount of time.

Below is my code for retrieving ERA5 data:

code start

import calendar
import cdsapi

server = cdsapi.Client()


def retrieve_era5():
    """
    A function to demonstrate how to iterate efficiently over several years and months etc.
    for a particular era5_request.
    Change the variables below to adapt the iteration to your needs.
    You can use the variable 'target' to organise the requested data in files as you wish.
    In the example below the data are organised in files per month (e.g. "era5_daily_201510.grb").
    """
    yearStart = 1998
    yearEnd = 1998
    monthStart = 1
    monthEnd = 1
    for year in range(yearStart, yearEnd + 1):
        Year = str(year)
        for month in range(monthStart, monthEnd + 1):
            Month = str(month)
            # startDate = '%04d-%02d-%02d' % (year, month, 1)
            numberOfDays = calendar.monthrange(year, month)[1]
            Days = [str(x) for x in range(1, numberOfDays + 1)]
            # lastDate = '%04d-%02d-%02d' % (year, month, numberOfDays)
            target = "era5_1h_daily_0to70S_100Eto120W_025025_quv_%04d%02d.nc" % (year, month)
            # requestDates = (startDate + "/" + lastDate)
            era5_request(Year, Month, Days, target)


def era5_request(Year, Month, Days, target):
    """
    An ERA5 request for analysis pressure-level data.
    Change the keywords below to adapt it to your needs
    (e.g. to add or to remove levels, parameters, times etc.).
    """
    server.retrieve(
        'reanalysis-era5-pressure-levels',
        {'product_type': 'reanalysis',
         'format': 'netcdf',
         'variable': ['specific_humidity', 'u_component_of_wind', 'v_component_of_wind'],
         'year': Year,
         'month': Month,
         'day': Days,
         'pressure_level': ['300', '350', '400', '450', '500', '550', '600', '650', '700',
                            '750', '775', '800', '825', '850', '875', '900', '925', '950',
                            '975', '1000'],
         'time': ['00:00', '01:00', '02:00', '03:00', '04:00', '05:00', '06:00', '07:00',
                  '08:00', '09:00', '10:00', '11:00', '12:00', '13:00', '14:00', '15:00',
                  '16:00', '17:00', '18:00', '19:00', '20:00', '21:00', '22:00', '23:00'],
         'area': [0, 100, -1, 101]},
        target)


if __name__ == '__main__':
    retrieve_era5()

code end

This code starts small: it tries to download specific_humidity, u_component_of_wind and v_component_of_wind from 1998-01-01 to 1998-01-31; temporal resolution: 1 hour; spatial resolution: 0.25° x 0.25°; pressure levels: 300 hPa to 1000 hPa; area: 1°S to 0°, 100°E to 101°E.

Below is a screenshot showing the results of running this code (image not included).
Below is a screenshot showing a download via ecmwfapi: basically, 2.18 GB retrieved in 23 minutes (image not included). I'm not sure what is going on here.

Could anyone give me some advice? Many thanks.

Hi Eddy / Guys


I'm new in the community and would appreciate your support.

I want to download daily precipitation data but got the error below. It seems the value for the "project" parameter should be something other than TBD. (Note: I ran the "test" retrieval of ERA5 data and it works well.)

luis_h@LAPTOP-3D44S1T1:~$ python3 script.py
Traceback (most recent call last):
File "/home/luis_h/script.py", line 12, in
"project": TBD,
NameError: name 'TBD' is not defined

Below is the script content (taken from the C3S guide "Daily statistics calculated from ERA5 data"):

import cdsapi

c = cdsapi.Client()
MONTHS = ["01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12"]
for month in MONTHS:
    result = c.service(
        "tool.toolbox.orchestrator.workflow",
        params={
            "realm": "c3s",
            "project": TBD,
            "version": "master",
            "kwargs": {
                "dataset": "reanalysis-era5-single-levels",
                "product_type": "reanalysis",
                "variable": "total_precipitation",
                "statistic": "daily_mean",
                "year": "2020",
                "month": "month",
                "time_zone": "UTC+00:0",
                "frequency": "1-hourly",
                "grid": "0.25/0.25",
                "area": {"lat": [-20, -10], "lon": [40, 30]},
            },
            "workflow_name": "daily_prec",
        })
c.download(result)

Originally posted by @LuisHernandoG in #28 (comment)

cdsapi error: invalid syntax

Hi,

I installed cdsapi by following the page (https://cds.climate.copernicus.eu/api-how-to#install-the-cds-api-key) to download ERA5 on Linux, then tried to import it. However, I had the following error.
I used to be able to download ERA5, but I can't now. I use Miniconda for Python. I tried Python 2 and Python 3; neither works. Does anyone know why? Thanks in advance. -DJ

import cdsapi

File "/home/dj/Google_Drive/ROFS/py/get_era.py", line 63, in get_era5_month
import cdsapi
File "/home/dj/software/miniconda3/envs/roms_cf2/lib/python2.7/site-packages/cdsapi/init.py", line 21, in
from . import api
File "/home/dj/software/miniconda3/envs/roms_cf2/lib/python2.7/site-packages/cdsapi/api.py", line 335
def service(self, name, *args, mimic_ui=False, **kwargs):
^
SyntaxError: invalid syntax

is arg 'area' always implemented?

For which data sets is (or is not) the argument 'area' implemented?

For example, this call appears to return a zipped netCDF of the full geographic extent, meaning the item limit is quickly exceeded.

import cdsapi

c = cdsapi.Client()
c.retrieve("derived-utci-historical",
           {"format": "zip",  # GRIB OR NETCDF NOT IMPLEMENTED?
            "variable": "Mean radiant temperature",
            'product_type': 'Consolidated dataset',
            "day": '15',
            "month": '08',
            "year": '2016',
            'area': [44.0, -10.0, 35.0, 5.0],  # NOT IMPLEMENTED?
            'grid': [0.25, 0.25],  # IMPLEMENTED?
            })

KeyError: 'kwargs' when submitting a workflow with workflow method

If I try to submit a workflow with the following code I get an error:

import cdsapi
cds = cdsapi.Client()
result = cds.workflow(code)

I get the following error:

 Request is failed
 Message: an internal error occurred processing your request
 Reason:  'kwargs'
   Traceback (most recent call last):
     File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 49, in handle_request
       result = cached(context.method, proc, context, context.args, context.kwargs)
     File "/opt/cdstoolbox/cdscompute/cdscompute/caching.py", line 108, in cached
       result = proc(context, *context.args, **context.kwargs)
     File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 118, in __call__
       return p(*args, **kwargs)
     File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 59, in __call__
       return self.proc(context, *args, **kwargs)
     File "/home/cds/cdsservices/services/run_workflow.py", line 12, in execute
       params['kwargs'].pop('_timestamp', None)
   KeyError: 'kwargs'

The error doesn't depend on the code.
I have been getting this error since the mimic_ui function was added. It seems that the service method API has changed.

cdsapi.Client changes logging level of external loggers

The cdsapi.Client constructor includes a call to logging.basicConfig which modifies the logging behavior for loggers unrelated to cdsapi. Here's an example:

import logging
import cdsapi

# external logger initially at WARNING level:
my_logger = logging.getLogger('my_logger')
print(my_logger)  # <Logger my_logger (WARNING)>
my_logger.info('this is not printed')
client = cdsapi.Client()
# external logger now set to INFO:
print(my_logger)  # <Logger my_logger (INFO)>
my_logger.info('this is printed')

output:

<Logger my_logger (WARNING)>
<Logger my_logger (INFO)>
2021-08-20 07:52:24,840 INFO this is printed

And in fact it modifies the log level of loggers from a bunch of other packages; check out the contents of logging.root.manager.loggerDict before and after creating a Client object and you'll see many of them switch from WARNING to INFO as well.

I have a patch ready to fix this and will submit a PR shortly.
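For context, the behaviour can be reproduced (and a workaround verified) without cdsapi at all, since logging.basicConfig only changes the effective level of loggers whose own level is unset. A sketch of a stopgap until the patch lands: pin your logger's level explicitly so it is no longer inherited from root.

```python
import logging

my_logger = logging.getLogger('my_logger')
my_logger.setLevel(logging.WARNING)      # explicit level, not inherited from root

# Simulate the call cdsapi.Client() makes internally:
logging.basicConfig(level=logging.INFO)

# A logger with an explicitly-set level is unaffected by root's new level:
assert my_logger.getEffectiveLevel() == logging.WARNING
```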

Tag releases?

If this repository is the official one for the releases made on PyPI [1], it would be very helpful for my work maintaining the CDSAPI Debian package [2] if you could tag the releases in git. Working from the PyPI release tarballs is possible, but it's always simpler if one can use upstream's git.

(If this is not the official CDSAPI repository, feel free to close and ignore this issue).

[1] https://pypi.org/project/cdsapi/

[2] https://tracker.debian.org/pkg/python-cdsapi

Where is the full API documentation?

Sorry for the silly issue, but I couldn't find it on the web.

I am looking for the full API documentation for the CDS API. I would like to read about all the different options, the allowed values, etc. Apologies if this is obvious, but I couldn't figure it out.

Thanks for providing this package.

Infinite while loop if state reply is unknown

raise Exception("Unknown API state [%s]" % (reply["state"],))

There should be a break condition to exit the while loop when the reply state label is unknown. Apparently some replies carry a state outside the usual set: over the last 3 months I have repeatedly found my requests stuck in an infinite loop.
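A bounded polling loop with an explicit escape for unrecognised states could look like this sketch (`get_reply` is a hypothetical stand-in for the client's status call, and the state names mirror the usual CDS flags):

```python
KNOWN_STATES = {"queued", "running", "completed", "failed"}

def poll(get_reply, max_polls=500):
    """Poll until completion, but bail out on an unknown state or after
    max_polls iterations instead of looping forever."""
    for _ in range(max_polls):
        reply = get_reply()
        state = reply["state"]
        if state == "completed":
            return reply
        if state == "failed":
            raise RuntimeError(reply.get("error", "request failed"))
        if state not in KNOWN_STATES:
            raise ValueError("Unknown API state [%s]" % state)
    raise TimeoutError("no result after %d polls" % max_polls)
```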

ERA 5 Wave Spectra

Hello

What is the most recent data available?

How long after a given date do I have to wait for the reanalysis data to become available?

Exception: Resource cams-global-reanalysis-eac4 not found

Trying to retrieve data via API results in an exception (see title).
The code used was generated by the CAMS UI located here: https://ads.atmosphere.copernicus.eu/cdsapp#!/dataset/cams-global-reanalysis-eac4?tab=form

Code generated:

import cdsapi

c = cdsapi.Client()

c.retrieve(
    'cams-global-reanalysis-eac4',
    {
        'variable': 'temperature',
        'pressure_level': '1',
        'model_level': '1',
        'date': '2003-01-01/2020-12-31',
        'time': [
            '00:00', '03:00', '06:00',
            '09:00', '12:00', '15:00',
            '18:00', '21:00',
        ],
        'format': 'grib',
    },
    'download.grib')

All terms/conditions accepted and .cdsapirc set according to instructions laid out here (for Mac): https://ads.atmosphere.copernicus.eu/api-how-to

UPDATE: that error is very misleading. The issue was that I need to use the ADS URL and key (rather than the CDS version), as provided here: https://ads.atmosphere.copernicus.eu/api-how-to
It's also very confusing that the instructions on that ADS page explicitly refer to CDS...

401 Client Error: Unauthorized for url: ...

I'm trying to retrieve a dataset from a Google Collab notebook, unsuccessfully so far, although I did apply all instructions.

from google.colab import files  # needed for files.upload()
import pandas as pd             # needed for pd.read_csv() below
import cdsapi

uploaded = files.upload()  # to upload the .cdsapirc
!cp .cdsapirc ../root/

c = cdsapi.Client()

c.retrieve(
    'insitu-glaciers-elevation-mass',
    {
        'product_type': [
            'elevation_change', 'mass_balance',
        ],
        'file_version': '20181103',
        'variable': 'all',
        'format': 'zip',
    },
    'download.zip')

insitu_glaciers_elevation_mass = pd.read_csv('download.zip', compression='zip')

This yields the following error:

2020-04-15 13:18:23,712 INFO Welcome to the CDS
2020-04-15 13:18:23,713 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/insitu-glaciers-elevation-mass
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/cdsapi/api.py in _api(self, url, request, method)
    388         try:
--> 389             result.raise_for_status()
    390             reply = result.json()

3 frames
/usr/local/lib/python3.6/dist-packages/requests/models.py in raise_for_status(self)
    939         if http_error_msg:
--> 940             raise HTTPError(http_error_msg, response=self)
    941 

HTTPError: 401 Client Error: Unauthorized for url: https://cds.climate.copernicus.eu/api/v2/resources/insitu-glaciers-elevation-mass

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
<ipython-input-12-ceb78a244bb3> in <module>()
     16         'format': 'zip',
     17     },
---> 18     'download.zip')
     19 
     20 insitu_glaciers_elevation_mass = pd.read_csv('download.zip', compression='zip')

/usr/local/lib/python3.6/dist-packages/cdsapi/api.py in retrieve(self, name, request, target)
    315 
    316     def retrieve(self, name, request, target=None):
--> 317         result = self._api('%s/resources/%s' % (self.url, name), request, 'POST')
    318         if target is not None:
    319             result.download(target)

/usr/local/lib/python3.6/dist-packages/cdsapi/api.py in _api(self, url, request, method)
    408                                  "of '%s' at %s" % (t['title'], t['url']))
    409                     error = '. '.join(e)
--> 410                 raise Exception(error)
    411             else:
    412                 raise

Exception: <html>
<head><title>401 Authorization Required</title></head>
<body>
<center><h1>401 Authorization Required</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>

I did first accept the terms & conditions:
Imgur

What am I missing?

Feature request: async support

A simple use of aiohttp for requesting and aiofiles for writing to disk would greatly improve the usability of cdsapi when launching several parallel requests.
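Until native async support exists, parallel requests can be launched with threads; the helper below is an illustrative sketch, not part of cdsapi (to be conservative, create a separate Client per thread inside `retrieve_fn`):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_retrieve(retrieve_fn, jobs, workers=4):
    """Run several blocking retrieve calls concurrently.

    retrieve_fn: a callable taking (request_dict, target_filename), e.g.
        lambda req, target: cdsapi.Client().retrieve(name, req, target)
    jobs: iterable of (request_dict, target_filename) pairs
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: retrieve_fn(*job), jobs))
```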

Max retries exceeded with url, timed out.

I started getting this error:

Recovering from connection error [HTTPSConnectionPool(host='cds.climate.copernicus.eu', port=443): Max retries exceeded with url: /api/v2/tasks/46f7822a-449a-499f-bbcf-c3cdc2b77420 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f8b9faf7400>, 'Connection to cds.climate.copernicus.eu timed out. (connect timeout=None)'))], attemps 0 of 500 Retrying in 120 seconds

I've tried adding a timeout.

Requesting Ocean and Atmosphere variables from ERA5-hourly as netcdf leads to ocean variables incorrectly represented on atmospheric grid.

I submitted a request for a regional box that roughly corresponds to the gulf of Mexico, for several variables of interest over the period of development of Katrina.

The surface atmosphere variables are correctly gridded.

The surface ocean variables only fill the top left quarter of the grid.

It looks like the lower resolution ocean variables have been added to the higher resolution atmospheric grid.


Request:

import cdsapi

c = cdsapi.Client()
c.retrieve(
        "reanalysis-era5-single-levels",
        {
            "product_type": "reanalysis",
            "format": "netcdf",
            "variable": [
                "10m_u_component_of_wind",
                "10m_v_component_of_wind",
                "2m_dewpoint_temperature",
                "2m_temperature",
                "mean_sea_level_pressure",
                "mean_wave_direction",
                "mean_wave_period",
                "sea_surface_temperature",
                "significant_height_of_combined_wind_waves_and_swell",
                "surface_pressure",
                "total_precipitation",
            ],
            "year": "2005",
            "month": "08",
            "day": [
                "20",
                "21",
                "22",
                "23",
                "24",
                "25",
                "26",
                "27",
                "28",
                "29",
                "30",
                "31",
            ],
            "time": [
                "00:00",
                "01:00",
                "02:00",
                "03:00",
                "04:00",
                "05:00",
                "06:00",
                "07:00",
                "08:00",
                "09:00",
                "10:00",
                "11:00",
                "12:00",
                "13:00",
                "14:00",
                "15:00",
                "16:00",
                "17:00",
                "18:00",
                "19:00",
                "20:00",
                "21:00",
                "22:00",
                "23:00",
            ],
            "area": [35, -100, 15, -80],
        },
        "katrina_era5.nc")

SSL errors without ability to ignore

I'm currently getting the following error while using cdsapi, and I can't find a way to ignore SSL errors.

WARNING:cdsapi:Recovering from connection error [HTTPSConnectionPool(host='136.156.132.235', port=443): Max retries exceeded with url: /cache-compute-0000/cache/data4/adaptor.mars.internal-1608196857.3120284-11250-14-fe304c3c-021b-42ef-824b-9d38937d9338.nc (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",),))], attemps 0 of 500
WARNING:cdsapi:Retrying in 120 seconds
INFO:cdsapi:Retrying now...

Can we get a way to silence and ignore SSL errors?

Setting `retry_max` to `0` raises error

To avoid retrying downloads (as they corrupt my request #66 ), I attempted to set retry_max=0 in the client definition. However, this causes cdsapi to fail completely.

I believe the offending line is here:

cdsapi/cdsapi/api.py

Lines 106 to 109 in 85351e4

tries = 0
headers = None
while tries < self.retry_max:

It seems that there is an off-by-one mistake here: I would expect a single attempt to be made when retry_max is set to 0. Currently, if retry_max=n, the number of retries is n-1. The comparison should therefore be:

    while tries <= self.retry_max:

My stack trace:

2023-02-28 13:11:00,405 INFO Welcome to the CDS
2023-02-28 13:11:00,406 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/satellite-lai-fapar
Traceback (most recent call last):
  File "usr\venvs\ecoextreml\lib\site-packages\cdsapi\api.py", line 427, in _api
    result.raise_for_status()
AttributeError: 'NoneType' object has no attribute 'raise_for_status'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "usr\venvs\ecoextreml\lib\site-packages\cdsapi\api.py", line 433, in _api
    reply = result.json()
AttributeError: 'NoneType' object has no attribute 'json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "usr\download_scripts\download_FAPAR_LAI.py", line 46, in <module>
    get_data(c, year=2010)
  File "usr\download_scripts\download_FAPAR_LAI.py", line 20, in get_data
    cds_client.retrieve(
  File "usr\venvs\ecoextreml\lib\site-packages\cdsapi\api.py", line 348, in retrieve
    result = self._api("%s/resources/%s" % (self.url, name), request, "POST")
  File "usr\venvs\ecoextreml\lib\site-packages\cdsapi\api.py", line 435, in _api
    reply = dict(message=result.text)
AttributeError: 'NoneType' object has no attribute 'text'
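The off-by-one is easy to demonstrate in isolation; this is a stripped-down model of the loop, not the actual cdsapi code:

```python
def attempts(retry_max, inclusive=False):
    """Count how many times the loop body runs for a given retry_max."""
    tries = count = 0
    while (tries <= retry_max) if inclusive else (tries < retry_max):
        count += 1
        tries += 1
    return count

assert attempts(0) == 0                    # current `<`: zero attempts, result stays None
assert attempts(0, inclusive=True) == 1    # proposed `<=`: one attempt, no retries
assert attempts(3, inclusive=True) == 4    # retry_max retries after the first attempt
```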

Allow to change path of configuration file

Is your feature request related to a problem? Please describe.

According to the documentation, the configuration file has to be at a fixed place:

Paste the 2 line code into a %USERPROFILE%\.cdsapirc file (in your Windows environment, %USERPROFILE% is usually the C:\Users\Username folder).

I want to use cdsapi with a portable version of Python and therefore would prefer to be able to adapt the path.

Please provide an option to do so (or document it if already available)
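Two workaround sketches in the meantime: the Client constructor accepts url and key arguments directly, bypassing the file entirely, or the file location can be made configurable by reading the path from an environment variable (the CDSAPI_RC variable name used here is an assumption, not an official setting):

```python
import os

def read_cdsapirc(path=None):
    """Parse a .cdsapirc-style file from an adjustable location.

    Resolution order: explicit path argument, a (hypothetical) CDSAPI_RC
    environment variable, then the default home-directory location.
    """
    path = path or os.environ.get("CDSAPI_RC") or os.path.expanduser("~/.cdsapirc")
    config = {}
    with open(path) as f:
        for line in f:
            if ":" in line:
                key, _, value = line.partition(":")
                config[key.strip()] = value.strip()
    return config

# cfg = read_cdsapirc("/portable/python/cdsapirc")
# c = cdsapi.Client(url=cfg["url"], key=cfg["key"])  # pass values explicitly
```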

KeyboardInterrupt results in truncated netcdf files

  • ubuntu 18.04
  • python 3.6.9
  • cdsapi 0.3.0

Retrieving a file, hitting CTRL-C (exception KeyboardInterrupt) during the downloading phase results in a truncated file. This can be problematic as netcdf routines will happily read and return arrays from a truncated file. If the user forces an exit, perhaps the partially downloaded file should be deleted?
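One possible user-side guard, sketched here rather than taken from cdsapi itself, wraps the download and removes the partial file on any interruption (`result` stands for a cdsapi Result object):

```python
import os

def safe_download(result, target):
    """Download to `target`; delete the partial file if anything —
    including Ctrl-C — interrupts the transfer, then re-raise."""
    try:
        result.download(target)
    except BaseException:          # BaseException also catches KeyboardInterrupt
        if os.path.exists(target):
            os.remove(target)
        raise
```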

Load into memory without saving to disk

Is there any way to load the requested data into memory without saving to disk? I am trying to load the data as netcdf and do something like this:

import cdsapi

c = cdsapi.Client()

params =  {
    'product_type': 'reanalysis',
    'variable': 'total_precipitation',
    'year': '2019',
    'month': '01',
    'day': '01',
    'time': '00:00',
    'format': 'netcdf',
    'grid':[1.0, 1.0],
     } 

data = c.retrieve('reanalysis-era5-single-levels',  params)
ds = data.load() # doesn't exist, but could load data into memory

# process the data below here

I know .download() will save to disk, but I am curious whether there is any way to implement something like a .load() method that would load the data into memory as an xarray dataset. That way one could load and process the data in one script. The current solution is to download the data and throw away the file after processing, but I feel this process could be streamlined.

I looked at the download method but couldn't figure out a way to implement some type of load method. Any thoughts?

Thanks for making this package. This has saved me so much time!
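There is no .load() today, but the finished request can be fetched into memory by hand. This sketch assumes the Result object exposes its download URL as `result.location` (present in current cdsapi, but check your version):

```python
import io
import urllib.request

def load_into_memory(result):
    """Fetch a completed request's payload into a BytesIO, skipping disk."""
    with urllib.request.urlopen(result.location) as resp:
        return io.BytesIO(resp.read())

# data = c.retrieve('reanalysis-era5-single-levels', params)  # no target: returns a Result
# ds = xarray.open_dataset(load_into_memory(data))  # needs the h5netcdf or scipy engine
```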

Separation of retrieve and download

If retrieve is called without a target filename the method will execute a retrieve request to CDS and poll it until the request is 'completed' or 'failed' and return a Result object after that. The download method on this object can then be called to actually download the data file.

But for retrieve requests which take a very long time to complete (> 1 day) this workflow may not be suitable (on all systems). I would therefore propose an option to have a slightly different workflow:

  1. The retrieve request returns a Result object after the first ("robust") request without waiting for the state of the reply to be 'completed' or 'failed'.
  2. A new instance method on the Result class named e.g. 'query_state', 'update', 'update_state' or something like this is added.

This would allow users more control over the retrieval and download process. In our system it is for example not optimal to have very long running processes. We can then split the processing up into one task which executes the retrieve request and puts the Result object information in our key value store. After that we can then regularly dispatch a task which queries the state of the CDS task and downloads the resulting dataset when ready. Just like you do internally in cdsapi now.

So in summary this suggestion will not change anything about how cdsapi works now. But it will give users more flexibility in integrating cdsapi in data processing pipelines. Please let me know what you think of the above. Please see PR below for details.
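In pseudocode, the proposed workflow might read as follows (method names such as `update` are illustrative only, taken from the suggestion above):

```
result = client.retrieve(dataset, request)   # returns at once, state still 'queued'
store.save(result.reply)                     # persist the request id externally

# later, in a separate short-lived task:
result.update()                              # re-query the CDS task state
if result.reply['state'] == 'completed':
    result.download(target)
```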

Using cdsapi on Python Anywhere

What maintenance does this project need?

I am trying to use cdsapi on Python Anywhere, so I asked them to include this library on the allowed list, which according to their e-mail, they added :)

Now I am having difficulties to find on cdsapi documentation how to setup the proxy according to Python Anywhere documentation:

In order to make a connection to a site that is on the allowlist, you will need to connect through our proxy server. This is an HTTP proxy at proxy.server:3128. Most Python libraries recognise and use the setting that we supply (for instance, with requests, you don't need to do anything special), others need to be specifically configured to use the proxy (check the documentation of the library to find out how) and some don't work through the proxy at all (for instance, any library that uses httplib2 on Python 3 e.g. twilio)
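Since cdsapi talks HTTP via the `requests` library, and requests honours the standard proxy environment variables, setting them before creating the Client should be enough — a sketch, untested on PythonAnywhere itself:

```python
import os

# Point the standard proxy variables at PythonAnywhere's proxy before
# creating the Client; requests picks these up automatically.
os.environ["HTTP_PROXY"] = "http://proxy.server:3128"
os.environ["HTTPS_PROXY"] = "http://proxy.server:3128"

# import cdsapi
# c = cdsapi.Client()   # outgoing calls now routed through proxy.server:3128
```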

Organisation

No response

Convert IP addresses in URLs to domain names

Our outgoing firewall rules are based on domain names, but CDS download URLs have IP addresses. It was straightforward to modify api.py to do the conversion. Would it be possible to include that as a feature?

How to get the latest date available for the dataset?

Is your feature request related to a problem? Please describe.

When attempting to automatically download data (e.g., ERA5) with cdsapi, I consistently have to wrap requests in a try...except block to handle failures when the requested date is past the latest available one, and the only way to detect this is from the error log. Could an interface be provided that tells me when to stop the download process?

Describe the solution you'd like

An API that reports the latest available date for a given dataset.
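Until then, one interim workaround can be sketched: probe backwards from today until a minimal one-day request succeeds (the `try_fetch` helper is hypothetical — e.g. a one-day retrieve wrapped in try/except that returns True when data exists):

```python
import datetime

def latest_available(try_fetch, max_back=30):
    """Walk backwards from today until try_fetch(date) reports data.

    try_fetch: user-supplied probe returning True when data exists for
    that date; returns the newest such date, or None within max_back days.
    """
    day = datetime.date.today()
    for _ in range(max_back):
        if try_fetch(day):
            return day
        day -= datetime.timedelta(days=1)
    return None
```

This is wasteful (one probe request per missing day), which is exactly why a metadata endpoint would be preferable.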

Describe alternatives you've considered

No response

Additional context

No response

Organisation

No response

Error decoding the grib file when reading/streaming the data from S3 bucket

I am generating the data in grib format using the client and then saving the generated file in the AWS S3 bucket. Now, when I am trying to read the file from the S3 bucket, I am getting an error.

Client code

c = cdsapi.Client(url=URL, key=KEY)
c.retrieve(
    'reanalysis-era5-land',
    {
        'variable': 'soil_temperature_level_4',
        'year': year,
        'month': [
            '01', '02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12',
        ],
        'day': [
            '01', '02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12',
            '13', '14', '15',
            '16', '17', '18',
            '19', '20', '21',
            '22', '23', '24',
            '25', '26', '27',
            '28', '29', '30',
            '31',
        ],
        'time': [
            '00:00', '01:00', '02:00',
            '03:00', '04:00', '05:00',
            '06:00', '07:00', '08:00',
            '09:00', '10:00', '11:00',
            '12:00', '13:00', '14:00',
            '15:00', '16:00', '17:00',
            '18:00', '19:00', '20:00',
            '21:00', '22:00', '23:00',
        ],
        'area': [
            np.floor(location[1]*100)/100,
            np.floor(location[0]*100)/100,
            np.ceil(location[1]*100)/100,
            np.ceil(location[0]*100)/100,
        ],
        'format': FORMAT,
    })

Code for reading the file from S3

def get_grib_from_s3(grib_file):
    s3_client = boto3.client('s3')
    bucket = <my_bucket_name>
    try:
        resp = s3_client.get_object(
            Bucket=bucket,
            Key=grib_file
        )
        return resp
    except s3_client.exceptions.NoSuchKey as e:
        raise HTTPException(
            status_code=404,
            detail=e
        )

GRIB_FILE_NAME = <filename>.grib
a = get_grib_from_s3(GRIB_FILE_NAME)

and then I try to decode the streaming value as follows:

a['Body'].read().decode('utf-8')

When I do this, I get the following error:

Traceback (most recent call last):
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/uvicorn/protocols/http/h11_impl.py", line 396, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
    return await self.app(scope, receive, send)
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/fastapi/applications.py", line 199, in __call__
    await super().__call__(scope, receive, send)
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/starlette/middleware/errors.py", line 181, in __call__
    raise exc from None
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/starlette/middleware/cors.py", line 78, in __call__
    await self.app(scope, receive, send)
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/starlette/exceptions.py", line 82, in __call__
    raise exc from None
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/starlette/routing.py", line 580, in __call__
    await route.handle(scope, receive, send)
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/starlette/routing.py", line 241, in handle
    await self.app(scope, receive, send)
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/starlette/routing.py", line 52, in app
    response = await func(request)
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/fastapi/routing.py", line 202, in app
    dependant=dependant, values=values, is_coroutine=is_coroutine
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/fastapi/routing.py", line 150, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/home/energycortex/.local/share/virtualenvs/wärmenachfragetool-i62m6NeI/lib/python3.7/site-packages/starlette/concurrency.py", line 40, in run_in_threadpool
    return await loop.run_in_executor(None, func, *args)
  File "/usr/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "./app/api/v1/slp.py", line 37, in get_warm_slp
    temps = HeatSLPFunctions.load_temperature_data(db, year, location)
  File "./app/services/heat_demand_funcs.py", line 185, in load_temperature_data
    print(a['Body'].read().decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 11: invalid start byte
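The traceback's last line is the key: GRIB is a binary format (it starts with the ASCII magic "GRIB", but the body is arbitrary bytes), so `.decode('utf-8')` can never work on it. A sketch of handling the bytes instead — write them to a temp file and open that with a GRIB-aware reader:

```python
import tempfile

def grib_body_to_file(body_bytes, suffix=".grib"):
    """Write an S3 object's raw GRIB bytes to a temp file and return its
    path; open the file with cfgrib/xarray instead of decoding as text."""
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    tmp.write(body_bytes)
    tmp.close()
    return tmp.name

# path = grib_body_to_file(a['Body'].read())
# ds = xarray.open_dataset(path, engine='cfgrib')   # requires the cfgrib package
```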

InsecureRequestWarning -- how to best resolve this warning/error?

When I make a request for a dataset via the Copernicus Climate Data Store API I get a series of InsecureRequestWarning messages such as the below:

/home/james/miniconda3/envs/climate/lib/python3.7/site-packages/urllib3/connectionpool.py:847: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)

The request never returns with the requested dataset, and instead fails with a no data response error:

2019-05-11 14:14:13,199 INFO Request is failed
2019-05-11 14:14:13,201 ERROR Message: no data is available within your requested subset
2019-05-11 14:14:13,206 ERROR Reason:  Request returned no data
2019-05-11 14:14:13,210 ERROR   Traceback (most recent call last):
2019-05-11 14:14:13,214 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 49, in handle_request
2019-05-11 14:14:13,218 ERROR       result = cached(context.method, proc, context, *context.args, **context.kwargs)
2019-05-11 14:14:13,222 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/caching.py", line 108, in cached
2019-05-11 14:14:13,226 ERROR       result = proc(context, *context.args, **context.kwargs)
2019-05-11 14:14:13,228 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 110, in __call__
2019-05-11 14:14:13,230 ERROR       return p(*args, **kwargs)
2019-05-11 14:14:13,233 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 59, in __call__
2019-05-11 14:14:13,234 ERROR       return self.proc(context, *args, **kwargs)
2019-05-11 14:14:13,236 ERROR     File "/home/cds/cdsservices/services/mars.py", line 338, in internal
2019-05-11 14:14:13,237 ERROR       return mars(context, request, **kwargs)
2019-05-11 14:14:13,237 ERROR     File "/home/cds/cdsservices/services/mars.py", line 61, in mars
2019-05-11 14:14:13,238 ERROR       raise NoDataException("Request returned no data", '')
2019-05-11 14:14:13,239 ERROR   cdsinf.exceptions.NoDataException: Request returned no data

The Python code I'm using to make this request:

days = list(map(lambda x: str(x).zfill(2), range(1, 32)))
months = list(map(lambda x: str(x).zfill(2), range(1, 13)))
years = list(map(lambda x: str(x).zfill(2), range(1980, 2019)))
api_request = {
    'area': '6.002/33.501/-5.202/42.283',
    'day': days,
    'format': 'netcdf',
    'month': months,
    'product_type': 'reanalysis',
    'time': ['00:00'],
    'variable': 'total_precipitation',
    'year': years
}
cds_client = cdsapi.Client()
cds_client.retrieve(
    'reanalysis-era5-pressure-levels-monthly-means',
    api_request,
    'kenya_rainfall_.nc'
)

I have followed the instructions in the link from the warning message and added the following code, to no avail:

import certifi
import urllib3
http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where()
)

Please advise if this is a known issue with a resolution (sorry if so, I didn't find anything when searching for an answer to this), or if there are suggestions for a workaround. Thanks in advance.

How to download CAMS data in 0.125 grid resolution?

Dear colleagues, is there a way to access CAMS NRT data at a raster grid resolution other than 0.4 — for instance, 0.125?
I used to download these data using ECMWFDataServer from the ecmwfapi module, but now I receive a message that I should use cdsapi.
However, the documentation says that the grid parameter is not available. So I wonder how I can access this data at a different resolution?
Thanks in advance.

Confusing error message for invalid key

If the API key (by accident) is set to an invalid value:

$ cat .cdsapirc
url: https://cds.climate.copernicus.eu/api/v2
key: dummy

and we run the test example:

>>> import cdsapi
>>> cds = cdsapi.Client()
>>> cds.retrieve('reanalysis-era5-pressure-levels', {
           "variable": "temperature",
           "pressure_level": "1000",
           "product_type": "reanalysis",
           "date": "2017-12-01/2017-12-31",
           "time": "12:00",
           "format": "grib"
       }, 'download.grib')

we get an error message which is not very informative:

2019-05-15 08:54:26,421 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-pressure-levels
Traceback (most recent call last):
  File "<stdin>", line 8, in <module>
  File "/home/jbl/.local/lib/python3.6/site-packages/cdsapi/api.py", line 230, in retrieve
    result = self._api('%s/resources/%s' % (self.url, name), request)
  File "/home/jbl/.local/lib/python3.6/site-packages/cdsapi/api.py", line 245, in _api
    result = self.robust(session.post)(url, json=request, verify=self.verify)
  File "/home/jbl/.local/lib/python3.6/site-packages/cdsapi/api.py", line 362, in wrapped
    r = call(*args, **kwargs)
  File "/home/jbl/.local/lib/python3.6/site-packages/requests/sessions.py", line 581, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/home/jbl/.local/lib/python3.6/site-packages/requests/sessions.py", line 519, in request
    prep = self.prepare_request(req)
  File "/home/jbl/.local/lib/python3.6/site-packages/requests/sessions.py", line 462, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/home/jbl/.local/lib/python3.6/site-packages/requests/models.py", line 317, in prepare
    self.prepare_auth(auth, url)
  File "/home/jbl/.local/lib/python3.6/site-packages/requests/models.py", line 548, in prepare_auth
    r = auth(self)
TypeError: 'tuple' object is not callable
