Comments (7)
@Zeitsperre why do we need a special implementation for this? I would suppose it works on xarray ...
from clisops.
From our experience, it doesn't work right out of the box. You could try implementing a cleaner version without the special handling, but I think your results will be cut at the prime meridian or the dateline, depending on the GCS of the netCDF data.
from clisops.
We might be able to use the xarray method roll. This allows you to view the coordinate space from a different frame of reference:
http://xarray.pydata.org/en/stable/generated/xarray.DataArray.roll.html
Hence you could:
- Detect if longitude was 0 to 359 in the input dataset
- Roll it to -180 to 180 if required
- Then run the subset
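The three steps above can be sketched as follows. This is only an illustration on plain NumPy arrays standing in for an xarray object (with xarray, the roll itself would go through the roll method linked above). The function name roll_to_minus180_180 is hypothetical, and the sketch assumes a regularly increasing longitude axis stored as the last dimension.

```python
import numpy as np


def roll_to_minus180_180(lon, data):
    """Re-centre a 0..360 longitude axis on -180..180 (longitude is the last axis)."""
    lon = np.asarray(lon, dtype=float)
    if lon.max() <= 180:
        # Step 1: nothing above 180, so the axis is already in the -180..180 convention.
        return lon, data
    # Step 2: wrap values above 180 into the negative half (e.g. 270 -> -90) ...
    wrapped = np.where(lon > 180, lon - 360, lon)
    # ... and roll so that the smallest wrapped longitude comes first.
    shift = -int(np.argmin(wrapped))
    return np.roll(wrapped, shift), np.roll(data, shift, axis=-1)


lon = np.arange(0, 360, 90)          # 0, 90, 180, 270
data = np.array([10, 20, 30, 40])    # one value per longitude
new_lon, new_data = roll_to_minus180_180(lon, data)

# Step 3: the subset is now a plain range selection, even across the prime meridian.
mask = (new_lon >= -100) & (new_lon <= 100)
subset_lon, subset_data = new_lon[mask], new_data[mask]
```

Here the cell at 270 ends up first, relabelled as -90, so a request for (-100, 100) picks up a contiguous block instead of two pieces.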
from clisops.
If I recall correctly, the reason why we didn't want to use the "roll" approach was because of the high processing load needed when processing large (>250GB) NetCDF files. It made much more sense to slice up the vector file to better suit the shape of the NetCDF than to change all the NetCDF data with roll.
from clisops.
@agstephens I've been implementing this but haven't opened a PR yet as I've spotted a few issues with what I've done and I'm looking into them. Here are the changes I've made https://github.com/roocs/clisops/compare/cross_prime_meridian_fix
The issues are:
- I'm not sure how well this will work with the test data, as they have only a few longitudes, e.g. 0, 250.
- clisops/clisops/utils/dataset_utils.py, lines 27 to 29 in 3c810ba: Here I am checking whether the request is within the bounds of the dataset. However, the check is very strict: if I ask for a subset (0, 100) but the minimum longitude is 0.01, this tries to roll the dataset. The same issue might arise when subsetting again after a first subset in a workflow.
- clisops/clisops/utils/dataset_utils.py, line 10 in 3c810ba: I'm calculating how to roll to end up with longitude -180 to 180, so I think I need to change this to be more generic.
from clisops.
Here's what I've come up with:
- If there's only one longitude, return the dataset as it is and don't try to roll. I don't think this solves the test issue, but it does cover the possible case of one longitude.
- Change the check to something like
    if (lon_min <= low or np.isclose(low, lon_min, atol=0.5)) and (lon_max >= high or np.isclose(high, lon_max, atol=0.5)):
- Instead of rolling the first longitude value to -180, roll it to the lower bound of the requested subset longitude bounds, e.g.
    low, high = lon_bnds
    first_element_value = low
    diff, offset = calculate_offset(lon, first_element_value)
    # roll the data only; the longitude labels are shifted by diff afterwards
    ds_roll = ds.roll(shifts={lon.name: offset}, roll_coords=False)
    ds_roll.coords[lon.name] = ds_roll.coords[lon.name] + diff
    return ds_roll
with
    def calculate_offset(lon, first_element_value):
        # get resolution of data
        res = lon.values[1] - lon.values[0]
        # calculate how many degrees to move by to have lon[0] of rolled subset as lower bound of request
        diff = first_element_value - lon.values[0]
        # work out how many elements to roll by to roll data by 1 degree
        index = 1 / res
        # calculate the corresponding offset needed to change data by diff;
        # roll() moves data towards higher indices for positive shifts, so the sign is
        # flipped, and round() guards against float truncation (e.g. 71.999 -> 71)
        offset = -int(round(diff * index))
        return diff, offset
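To make the proposal concrete, here is a small worked example of the tolerant bounds check and the offset calculation. It is a sketch on plain NumPy arrays rather than an xarray Dataset, so np.roll stands in for ds.roll. The helper name request_within_bounds is hypothetical, and the sketch assumes a regular grid with the requested lower bound falling on a grid point. The offset is given a sign such that the cell equivalent to the lower bound (modulo 360) lands at index 0, since positive shifts move data towards higher indices.

```python
import numpy as np


def request_within_bounds(lon, low, high, atol=0.5):
    """Return True if (low, high) already lies inside the longitude axis, within atol degrees."""
    lon_min, lon_max = lon.min(), lon.max()
    low_ok = lon_min <= low or np.isclose(low, lon_min, atol=atol)
    high_ok = lon_max >= high or np.isclose(high, lon_max, atol=atol)
    return bool(low_ok and high_ok)


def calculate_offset(lon, first_element_value):
    """Return (degrees to add to the labels, number of cells to roll the data by)."""
    res = lon[1] - lon[0]
    diff = first_element_value - lon[0]
    # Positive shifts in np.roll move data towards higher indices, hence the minus sign.
    offset = -int(round(diff / res))
    return diff, offset


lon = np.arange(0.0, 360.0, 30.0)   # 0, 30, ..., 330
data = lon.copy()                   # each cell stores its own original longitude

# The strict-check case from above: the axis starts at 0.01, the request at 0.
tolerant_ok = request_within_bounds(lon + 0.01, 0, 100)

# A request that crosses the prime meridian does need a roll.
low, high = -90, 90
needs_roll = not request_within_bounds(lon, low, high)
diff, offset = calculate_offset(lon, low)
rolled_data = np.roll(data, offset)   # data only, like roll_coords=False
rolled_lon = lon + diff               # labels now run from -90 to 240
```

After the roll, the first cell carries the data that originally sat at 270 and is labelled -90, and every label still agrees with its data modulo 360.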
from clisops.
@ellesmith88: I've read through the code and it's looking great!
Here are my responses to your three points:
- If there is only one longitude: it's a special case, which could probably be solved by:
  - if actual_lon is not in lon_bnds:
    - if lon_bnds[0] > actual_lon: new_lon = actual_lon + 360
    - if lon_bnds[1] < actual_lon: new_lon = actual_lon - 360
    - then reassign the lon coord([new_lon])
    - NOTE: no need to roll here because the data array remains the same
- Change the check to something like if (lon_min <= low or np.isclose(low, lon_min, atol=0.5)) and (lon_max >= high or np.isclose(high, lon_max, atol=0.5)):
  - looks like a great plan
- Instead of rolling the first longitude value to -180, roll it to the lower bound of the requested subset longitude bounds:
  - yes, I think that is the best approach; it should deal with all cases.
Great stuff.
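The single-longitude case described in the first point could look roughly like this. It is a plain-Python sketch of the pseudocode only: the function name wrap_single_longitude is hypothetical, and reassigning the result to the dataset's longitude coordinate is left out.

```python
def wrap_single_longitude(actual_lon, lon_bnds):
    """Shift a lone longitude value by 360 degrees towards the requested bounds."""
    low, high = lon_bnds
    if low <= actual_lon <= high:
        # Already inside the request: keep the value as it is.
        return actual_lon
    if low > actual_lon:
        # The request lies to the east of the value: move the label up by 360.
        return actual_lon + 360
    # Otherwise the request lies to the west: move the label down by 360.
    return actual_lon - 360


inside = wrap_single_longitude(10, (0, 100))          # unchanged
from_east = wrap_single_longitude(350, (-20, 20))     # 350 is relabelled as -10
from_west = wrap_single_longitude(-170, (150, 200))   # -170 is relabelled as 190
```

Only the coordinate label changes, so no roll of the data is needed. If the shifted value still falls outside the bounds, the longitude is simply not part of the requested region and the subsequent subset would select nothing.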
from clisops.