Comments (10)
What we could do is split the behavior between the keyword argument data_vars and positional argument number 1:
def __init__(self, vars=None, /, coords=None, attrs=None, *, data_vars=None):
    ...
where vars promotes to coordinates, but data_vars doesn't (and you can't pass vars and data_vars at the same time).
Then, at some point, we deprecate the positional argument or all positionals (not sure exactly what that part should look like, though).
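A minimal sketch of how that split could behave, using a toy class rather than xarray itself. ToyDataset, its attributes, and the (dims, values) tuple layout are all hypothetical, and "promotion" is simplified to: a 1D variable whose only dimension has the same name as the variable is stored as a coordinate.

```python
class ToyDataset:
    """Toy stand-in for illustration only; not xarray's implementation."""

    def __init__(self, vars=None, /, coords=None, attrs=None, *, data_vars=None):
        if vars is not None and data_vars is not None:
            raise TypeError("pass either positional `vars` or `data_vars=`, not both")
        self.coords = dict(coords or {})
        self.attrs = dict(attrs or {})
        self.data_vars = {}
        if vars is not None:
            # Positional form: keep the current behavior and promote
            # dimension variables to coordinates.
            for name, (dims, values) in vars.items():
                target = self.coords if tuple(dims) == (name,) else self.data_vars
                target[name] = (dims, values)
        elif data_vars is not None:
            # Keyword form: store everything as data variables, no promotion.
            self.data_vars.update(data_vars)


promoted = ToyDataset({"x": (("x",), [0])})
kept = ToyDataset(data_vars={"x": (("x",), [0])})
```

Here promoted ends up with "x" among its coords, while kept holds "x" as a plain data variable.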
from xarray.
what we could do is split the behavior between keyword-argument data_vars and positional argument number 1:
def __init__(self, vars=None, /, coords=None, attrs=None, *, data_vars=None): ...
where vars promotes to coordinates, but data_vars doesn't (and you can't pass vars and data_vars at the same time).
I like this idea. It makes the current behavior more explicit and makes it clear how to opt-in to the new behavior.
from xarray.
I think that comes explicitly from the netCDF User's Guide and the CF Conventions ("coordinate variable").
Though reading it now, I guess it's saying all coordinate variables must be 1D with the same name as the dim, not necessarily the converse.
from xarray.
Auto-promoting dimension data variables to dimension coordinates when creating a new Dataset has indeed been the expected behavior so far.
I'm not sure what the best approach is, though. On one hand, xr.Dataset(data_vars={"x": [0]}) creating an "x" coordinate definitely looks odd to me too. On the other hand, xr.Dataset({"x": [0]}) not creating a dimension coordinate would feel quite disruptive to me.
from xarray.
Thanks for that context, @dopplershift!
The way I see it there are two consistent behaviours:
- all variables that are 1D with the same name as the dim must be coordinate variables, and both Dataset.__init__ and .expand_dims() should therefore coerce data variables to coordinate variables if necessary,
- variables that are 1D with the same name as the dim do not have to be coordinate variables, and we shouldn't coerce anywhere.
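The two options above can be sketched as a single rule with a switch, applied identically at every entry point. This is a toy illustration, not xarray code: is_dimension_variable, split_variables, and the coerce flag are hypothetical names, and variables are modelled as {name: (dims, values)}.

```python
def is_dimension_variable(name, dims):
    """True for a 1D variable whose single dimension shares its name."""
    return tuple(dims) == (name,)


def split_variables(variables, coerce):
    """Split {name: (dims, values)} into (data_vars, coords).

    coerce=True models the first option (always promote),
    coerce=False models the second (never promote).
    """
    data_vars, coords = {}, {}
    for name, (dims, values) in variables.items():
        if coerce and is_dimension_variable(name, dims):
            coords[name] = (dims, values)
        else:
            data_vars[name] = (dims, values)
    return data_vars, coords


variables = {"x": (("x",), [0]), "t": (("x",), [1])}
always = split_variables(variables, coerce=True)
never = split_variables(variables, coerce=False)
```

Either choice is consistent as long as the constructor and expand_dims-style operations go through the same rule; the inconsistency only appears when one of them coerces and the other doesn't.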
On the other hand, xr.Dataset({"x": [0]}) not creating a dimension coordinate would feel quite disruptive to me.
Yeah this would break quite a lot of user code...
from xarray.
On the other hand, xr.Dataset({"x": [0]}) not creating a dimension coordinate would feel quite disruptive to me.
Is it possible to imagine a deprecation cycle for this? One which detects this input pattern and raises a warning telling you to use xr.Dataset(coords={"x": [0]}) instead?
from xarray.
That looks like a nice solution @keewis, except maybe that xr.Dataset(data_vars={"x": [0]}) not creating an "x" coordinate would be a breaking change (in theory), which would require another deprecation cycle?
from xarray.
True. But I guess people passing data_vars explicitly would be less likely to expect it to create a coordinate variable, so we could shorten the cycle a bit (like, warn for 2-3 releases, then switch to the new behavior).
from xarray.
We decided to start by raising PendingDeprecationWarning to slowly discourage this usage.
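For illustration, a standalone sketch of what such a warning could look like. The helper warn_if_promoted and its message text are made up for this example and are not taken from xarray. Note that Python's default warning filters ignore PendingDeprecationWarning, which is what makes it a gentle first step; the sketch enables all warnings explicitly in order to observe it.

```python
import warnings


def warn_if_promoted(name, dims):
    """Warn when a data variable would be auto-promoted to a coordinate."""
    if tuple(dims) == (name,):
        warnings.warn(
            f"variable {name!r} would be promoted to a dimension coordinate; "
            "pass it via `coords=` instead",
            PendingDeprecationWarning,
            stacklevel=2,
        )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # surface warnings that are hidden by default
    warn_if_promoted("x", ("x",))            # name matches its only dim: warns
    warn_if_promoted("temperature", ("x",))  # ordinary data variable: silent

categories = [w.category for w in caught]
```

Only the first call emits a warning, so users who never enable these warnings see no change until the category is later escalated.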
from xarray.
#8979 implements this suggestion. The same PendingDeprecationWarning is issued if a variable passed to data_vars is auto-promoted to a coordinate. Eventually we would have to break that auto-promotion and remove the vars positional-only argument to complete the deprecation cycle.
from xarray.