Comments (12)
Ah, that's good to know. I will switch to https://storage.googleapis.com/cmip6/pangeo-cmip6.json and I will report any issues in https://github.com/pangeo-forge/cmip6-pipeline. Thanks, @jbusecke and @naomi-henderson!
from xmip.
Yikes, that looks like a nasty bug. Could you tell me a bit more about the version you are using? Did you install from conda/pip or from source?
Thanks, @jbusecke, for the quick response! I think I am using the latest version from GitHub. I used the command
pip install git+https://github.com/jbusecke/cmip6_preprocessing.git --upgrade
Also, I checked the following:
In [1]: import cmip6_preprocessing
...: cmip6_preprocessing.__version__
Out[1]: '0.1.5.dev319+g20e3868.d20210215'
I assume this is on the pangeo google cloud deployment?
Could you paste the full code (including the catalog URL you used) here? I'll see what is going on there.
Sure. I followed the steps described in intake-esm tutorial and I use the same URL:
import intake
url = 'https://raw.githubusercontent.com/NCAR/intake-esm-datastore/master/catalogs/pangeo-cmip6.json'
col = intake.open_esm_datastore(url)
You can find a notebook with the relevant code here. Currently, I have found three models with the issue, but I only looked at ~10 models (out of 53).
Awesome. I think this is caused by the reordering of longitudes, which has caused me all kinds of trouble. I am actually thinking of getting rid of that functionality altogether (#94). Checking this now.
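For readers unfamiliar with what "reordering of longitudes" involves, here is a minimal sketch in plain Python. It is not the cmip6_preprocessing implementation; the helper name `wrap_and_sort` and the toy values are made up for illustration. The idea is that longitudes are shifted to a 0-360 convention and sorted, and the data must be moved with exactly the same permutation. If the coordinate and the data ever get permuted differently, the field ends up scrambled, which is one plausible way a bug of this kind could show up.

```python
def wrap_and_sort(lons, values):
    """Shift longitudes to [0, 360) and sort them, moving the data along.

    Hypothetical helper for illustration only.
    """
    wrapped = [lon % 360 for lon in lons]
    # Permutation that sorts the wrapped longitudes
    order = sorted(range(len(wrapped)), key=wrapped.__getitem__)
    # Apply the SAME permutation to both the coordinate and the data
    return [wrapped[i] for i in order], [values[i] for i in order]


lons = [-180.0, -90.0, 0.0, 90.0]
values = ["a", "b", "c", "d"]
new_lons, new_values = wrap_and_sort(lons, values)
print(new_lons)    # [0.0, 90.0, 180.0, 270.0]
print(new_values)  # ['c', 'd', 'a', 'b']
```

This works for a single 1D longitude axis. On grids where longitude is a 2D field, the sort order can differ from row to row, so a single permutation may not be enough.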
Ok I was able to reproduce the error and it seems indeed related to the longitude ordering.
Here is a quick workaround while I try to fix that bug:
# Workaround: same steps as combined_preprocessing, with the x/y replacement step disabled
import matplotlib.pyplot as plt
from cmip6_preprocessing.preprocessing import (
    rename_cmip6,
    promote_empty_dims,
    correct_coordinates,
    correct_lon,
    correct_units,
    broadcast_lonlat,
    parse_lon_lat_bounds,
    sort_vertex_order,
    maybe_convert_bounds_to_vertex,
    maybe_convert_vertex_to_bounds,
)

def modified_preprocessing(ds):
    ds = ds.copy()
    # fix naming
    ds = rename_cmip6(ds)
    # promote empty dims to actual coordinates
    ds = promote_empty_dims(ds)
    # demote coordinates from data_variables
    ds = correct_coordinates(ds)
    # broadcast lon/lat
    ds = broadcast_lonlat(ds)
    # shift all lons to consistent 0-360
    ds = correct_lon(ds)
    # fix the units
    ds = correct_units(ds)
    # replace x,y with nominal lon,lat (disabled for this workaround)
    # ds = replace_x_y_nominal_lat_lon(ds)
    # rename the `bounds` according to their style (bound or vertex)
    ds = parse_lon_lat_bounds(ds)
    # sort vertices in a consistent manner
    ds = sort_vertex_order(ds)
    # convert vertex into bounds and vice versa, so both are available
    ds = maybe_convert_bounds_to_vertex(ds)
    ds = maybe_convert_vertex_to_bounds(ds)
    return ds

# `col` is the catalog opened with intake.open_esm_datastore(url)
for si in ['HadGEM3-GC31-MM', 'CMCC-ESM2', 'CMCC-CM2-HR4']:
    cat = col.search(activity_id='CMIP', grid_label='gn', source_id=si, variable_id=['areacello'])
    fig, axs = plt.subplots(ncols=2, constrained_layout=True, figsize=(20, 6))
    # without any preprocessing
    ddict = cat.to_dataset_dict(zarr_kwargs={'consolidated': True, 'decode_times': True})
    ddict[next(iter(ddict))].areacello[0].plot(ax=axs[0])
    # with the modified preprocessing
    ddict = cat.to_dataset_dict(zarr_kwargs={'consolidated': True, 'decode_times': True},
                                preprocess=modified_preprocessing)
    ddict[next(iter(ddict))].areacello[0].plot(ax=axs[1])
    plt.show()
Let me know if that works for you.
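Besides comparing the two panels by eye, a numeric check can help. The sketch below is only an illustration under an assumption that is not stated in this thread: that preprocessing should rename or reorder cell areas but never change their values. The helper name `same_cell_areas` is hypothetical and the lists are toy data. With real datasets, the flattened values of the raw and preprocessed `areacello` arrays would be passed instead.

```python
import math


def same_cell_areas(raw, processed, rel_tol=1e-9):
    """Return True if both inputs hold the same values, ignoring order and NaNs.

    Hypothetical helper for illustration only.
    """
    a = sorted(v for v in raw if not math.isnan(v))
    b = sorted(v for v in processed if not math.isnan(v))
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


raw = [1.0, 2.0, 3.0, 4.0]
reordered = [3.0, 4.0, 1.0, 2.0]   # same cells, different order
corrupted = [3.0, 4.0, 1.0, 0.0]   # one value changed

print(same_cell_areas(raw, reordered))  # True
print(same_cell_areas(raw, corrupted))  # False
```

A check like this only confirms that values are preserved. It does not show whether each value still sits at the right location, so the side-by-side plots remain useful.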
I think that works for me. Reordering of longitudes is indeed very useful but might not be essential for my analysis. Thanks a lot for looking into it so quickly!
A follow-up question that I'm just going to ask here (even though it is probably not the right place): I am seeing faulty data across various CMIP6 datasets obtained from 'https://raw.githubusercontent.com/NCAR/intake-esm-datastore/master/catalogs/pangeo-cmip6.json'. Those erroneous data are not related to using combined_preprocessing but must be in the underlying dataset or introduced when downloading the data. I'm not sure where I should report these issues. Is cmip6_preprocessing the right place?
I am actually not sure that is the most up to date catalog. @naomi-henderson has recently refactored a lot of the cloud data.
Can you try:
import intake
col = intake.open_esm_datastore("https://storage.googleapis.com/cmip6/pangeo-cmip6.json")
col
and see if the problems persist?
Otherwise, this repository is always a good place to report problems, but https://github.com/pangeo-forge/cmip6-pipeline might be even more appropriate. @naomi-henderson, are there official guidelines for reporting on the new catalog?
Hmmm, I am still trying to understand why the very old NCAR version of the JSON file for the Pangeo CMIP6 Google Cloud collection is still being used. They have a JSON file for their own collection at NCAR, but anyone using the GC collection should use the JSON file in GC. Yes, @jbusecke, your link to https://storage.googleapis.com/cmip6/pangeo-cmip6.json is correct.
The re-organization of the GC version is now complete. If you are still having trouble, please report here: https://github.com/pangeo-forge/cmip6-pipeline
The AWS copy might still be out of sync for a few more days.
The continued use of the old NCAR JSON file is probably partially my fault, since I put that URL into the cmip6-preprocessing readme back at the cmip6-hackathon. I have to thoroughly refactor the docs and make it really clear that people need to switch!
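For anyone updating older notebooks in the meantime, the switch discussed above could be guarded with a small helper along these lines. This is only a sketch: the function name `resolve_catalog_url` and the constant names are hypothetical and not part of intake-esm or cmip6_preprocessing. The two URLs are the ones quoted in this thread.

```python
import warnings

# URLs as quoted in this thread
OLD_NCAR_URL = (
    "https://raw.githubusercontent.com/NCAR/intake-esm-datastore/"
    "master/catalogs/pangeo-cmip6.json"
)
GC_URL = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"


def resolve_catalog_url(url):
    """Return the Google Cloud catalog URL if the outdated NCAR copy is requested.

    Hypothetical helper for illustration only.
    """
    if url.strip() == OLD_NCAR_URL:
        warnings.warn(
            "The NCAR copy of the Pangeo CMIP6 catalog is outdated; "
            "using the Google Cloud catalog instead.",
            stacklevel=2,
        )
        return GC_URL
    return url


print(resolve_catalog_url(OLD_NCAR_URL))
```

The returned URL can then be passed to `intake.open_esm_datastore` as in the snippets earlier in the thread.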