xarray-einstats

Stats, linear algebra and einops for xarray

Installation

To install, run

(.venv) $ pip install xarray-einstats

See the docs for more extensive install instructions.

Overview

As stated on the xarray website:

xarray makes working with multi-dimensional labeled arrays simple, efficient and fun!

xarray code is often more verbose than the numpy equivalent, but generally because it is clearer and therefore less error-prone and more intuitive. Here are some examples of this trade-off, where we believe the increased clarity is worth the extra characters:

numpy                   xarray
a[2, 5]                 da.sel(drug="paracetamol", subject=5)
a.mean(axis=(0, 1))     da.mean(dim=("chain", "draw"))
a.reshape((-1, 10))     da.stack(sample=("chain", "draw"))
a.transpose(2, 0, 1)    da.transpose("drug", "chain", "draw")

In some other cases, however, using xarray can result in overly verbose code that is often also less clear. xarray_einstats provides wrappers around some numpy and scipy functions (mostly numpy.linalg and scipy.stats) and around einops, with an API and features adapted to xarray. Continue at the getting started page.

Contributing

xarray-einstats is in active development and all types of contributions are welcome! See the contributing guide for details on how to contribute.

Relevant links

Similar projects

Here we list some similar projects we know of. Note that all of them are complementary and don't overlap:

Cite xarray-einstats

If you use this software, please cite it using the following template and the version-specific DOI provided by Zenodo. Click on the DOI badge to go to the Zenodo page and select the DOI corresponding to the version you used.

  • Oriol Abril-Pla. (2022). arviz-devs/xarray-einstats <version>. Zenodo. <version_doi>

or in bibtex format:

@software{xarray_einstats2022,
  author       = {Abril-Pla, Oriol},
  title        = {{xarray-einstats}},
  year         = 2022,
  url          = {https://github.com/arviz-devs/xarray-einstats},
  publisher    = {Zenodo},
  version      = {<version>},
  doi          = {<version_doi>},
}

xarray-einstats's People

Contributors

aloctavodia, humitos, oriolabril

xarray-einstats's Issues

Add logsumexp wrapper

I think it is the only function in scipy.special worth wrapping, so it might be worth adding a misc module for this and for any other "loose ends" functions we decide to wrap.
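
For reference, logsumexp computes log(sum(exp(x))) while avoiding overflow by factoring out the largest value. The following is a minimal pure-Python sketch of that computation only. It is not the proposed wrapper, which would presumably delegate to scipy.special.logsumexp and reduce over named dimensions; the function name and signature here are illustrative.

```python
import math
from typing import Iterable


def logsumexp(values: Iterable[float]) -> float:
    """Illustrative stable log(sum(exp(v))) over a flat sequence of floats."""
    values = list(values)
    if not values:
        # The empty sum is 0, and log(0) is -inf.
        return -math.inf
    peak = max(values)
    if math.isinf(peak):
        # All entries are -inf, or at least one entry is +inf.
        return peak
    # Subtracting the maximum keeps every exponent <= 0, so exp cannot overflow.
    return peak + math.log(sum(math.exp(v - peak) for v in values))


# A naive math.log(sum(math.exp(v) ...)) would overflow for inputs this large.
print(logsumexp([1000.0, 1000.0]))
```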

Support preliz distributions?

PreliZ distributions are similar to scipy ones in many aspects, but sometimes it is more convenient to work with them.

Some differences are that both continuous and discrete PreliZ distributions have pdf and logpdf methods (no pmf or logpmf), and that some methods, such as logcdf or isf, are missing.

display_np_arrays_as_images in (ported) einops tutorial missing

I installed xarray-einstats from conda-forge and tried to reproduce the einops tutorial (ported) from the documentation.

This posed several challenges, as the documentation did not make clear where to find:

  • the utils module from which to import the display_np_arrays_as_images function
  • the einops-image.zarr file containing the batch of images used in the tutorial

After digging around in this repository I managed to find the necessary files way down in the docs / source / tutorials folder.

Maybe it is worth adding some explanation to this tutorial of where and how to get those particular files onto your system (without having to do a full development install from the GitHub repository ;)

Readme formatting issues?

Not sure whether this is a helpful comment or already known — feel free to close — but it looks like the readme might have some formatting issues:

image

Tests fail: AttributeError: partially initialized module 'einops' has no attribute '_backends'

collected 140 items / 2 errors                                                                                                                                                               

=========================================================================================== ERRORS ===========================================================================================
_________________________________________________________________ ERROR collecting src/xarray_einstats/tests/test_einops.py __________________________________________________________________
tests/test_einops.py:6: in <module>
    from xarray_einstats.einops import raw_rearrange, raw_reduce, rearrange, reduce, translate_pattern
einops.py:9: in <module>
    import einops
einops.py:407: in <module>
    class DaskBackend(einops._backends.AbstractBackend):  # pylint: disable=protected-access
E   AttributeError: partially initialized module 'einops' has no attribute '_backends' (most likely due to a circular import)
__________________________________________________________________ ERROR collecting src/xarray_einstats/tests/test_numba.py __________________________________________________________________
tests/test_numba.py:6: in <module>
    from xarray_einstats.numba import histogram
numba.py:2: in <module>
    import numba
numba.py:9: in <module>
    @numba.guvectorize(
E   AttributeError: partially initialized module 'numba' has no attribute 'guvectorize' (most likely due to a circular import)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 2 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
===================================================================================== 2 errors in 1.83s ======================================================================================
*** Error code 2

OS: FreeBSD 13.1
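
One plausible reading of this traceback (an interpretation, not something the report confirms) is that the tests were collected with the package directory itself on sys.path, so the package's own einops.py and numba.py were imported in place of the installed einops and numba libraries. The sketch below reproduces that mechanism with the standard library only: a temporary file named colorsys.py imports "colorsys" and ends up importing itself. The helper name and the choice of colorsys are illustrative assumptions.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def import_with_shadowing_file(module_name: str, attribute: str) -> subprocess.CompletedProcess:
    """Import `module_name` in a child process while a same-named local file is first on sys.path."""
    with tempfile.TemporaryDirectory() as tmp:
        # The local file imports "itself" by name and then uses an attribute
        # that only the real module defines.
        shadow = Path(tmp) / f"{module_name}.py"
        shadow.write_text(f"import {module_name}\n{module_name}.{attribute}\n")
        # Putting the directory first on sys.path mimics running from inside it.
        snippet = f"import sys; sys.path.insert(0, {tmp!r}); import {module_name}"
        return subprocess.run(
            [sys.executable, "-c", snippet], capture_output=True, text=True
        )


result = import_with_shadowing_file("colorsys", "rgb_to_hls")
# The last stderr line is an AttributeError; its exact wording depends on the Python version.
print(result.stderr.strip().splitlines()[-1])
```

If this is indeed the cause, running the tests from a directory that does not contain the package's module files would avoid the shadowing.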

Many tests fail: AttributeError: 'DataArray' object has no attribute 'drop_indexes'

__________________________________________________________________________________ TestHistogram.test_histogram[dims2] __________________________________________________________________________________

self = <tests.test_numba.TestHistogram object at 0x9fbf948e0>
data = <xarray.Dataset>
Dimensions:  (plot_dim: 20, chain: 4, draw: 10, team: 6, match: 12)
Coordinates:
  * team     (team) ...7 0.8981 0.07392 ... 1.025 0.2036 3.072
    score    (chain, draw, match) int64 1 1 1 0 1 1 3 1 1 ... 2 1 2 2 0 0 2 0 1
dims = ('chain', 'team')

    @pytest.mark.parametrize("dims", ("team", ["team"], ("chain", "team")))
    def test_histogram(self, data, dims):
>       out = histogram(data["mu"], dims)

tests/test_numba.py:21: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../stage/usr/local/lib/python3.9/site-packages/xarray_einstats/numba.py:110: in histogram
    da = _remove_indexes_to_reduce(da, dims).stack({aux_dim: dims})
../stage/usr/local/lib/python3.9/site-packages/xarray_einstats/__init__.py:47: in _remove_indexes_to_reduce
    da = da.drop_indexes(indexes_to_remove)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <xarray.DataArray 'mu' (chain: 4, draw: 10, team: 6)>
1.987 0.7502 1.301 0.5294 0.03015 0.4339 ... 0.774 0.499 0.4961 ...    (team) <U1 'a' 'b' 'c' 'd' 'e' 'f'
  * chain    (chain) int64 0 1 2 3
  * draw     (draw) int64 0 1 2 3 4 5 6 7 8 9
name = 'drop_indexes'

    def __getattr__(self, name: str) -> Any:
        if name not in {"__dict__", "__setstate__"}:
            # this avoids an infinite loop when pickle looks for the
            # __setstate__ attribute before the xarray object is initialized
            for source in self._attr_sources:
                with suppress(KeyError):
                    return source[name]
>       raise AttributeError(
            f"{type(self).__name__!r} object has no attribute {name!r}"
        )
E       AttributeError: 'DataArray' object has no attribute 'drop_indexes'

/usr/local/lib/python3.9/site-packages/xarray/core/common.py:256: AttributeError

Version: 0.5.0
Python-3.9
FreeBSD 13.1
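
The traceback shows that the installed xarray does not define DataArray.drop_indexes, which suggests an xarray release older than the one that introduced the method. The sketch below checks the installed version using only the standard library. The threshold of 2022.11.0 is an assumption that should be verified against the xarray changelog, and the helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional, Tuple

# Assumption: first xarray release shipping drop_indexes; verify in the xarray changelog.
ASSUMED_MIN_XARRAY = (2022, 11, 0)


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn '2023.11.0' into (2023, 11, 0), ignoring any non-numeric suffix."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def xarray_supports_drop_indexes() -> Optional[bool]:
    """Return None if xarray is not installed, else whether it meets the assumed minimum."""
    try:
        installed = metadata.version("xarray")
    except metadata.PackageNotFoundError:
        return None
    return parse_release(installed) >= ASSUMED_MIN_XARRAY


print(xarray_supports_drop_indexes())
```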

tests/test_linalg.py::TestWrappers::test_svd - ValueError: new dimensions ... must be a superset ...

Hi! Observing the following tests failing in xarray-einstats 0.6.0:

python3.11-xarray-einstats> >       s_full.loc[{"dim": idx, "dim2": idx}] = s_da                                                                                                                                     
python3.11-xarray-einstats> tests/test_linalg.py:249:  
...
python3.11-xarray-einstats> E           ValueError: new dimensions ('batch', 'experiment', 'pointwise_sel') must be a superset of existing dimensions ('batch', 'experiment', 'dim')
...
python3.11-xarray-einstats> E           ValueError: new dimensions ('batch', 'pointwise_sel', 'dim2') must be a superset of existing dimensions ('batch', 'experiment', 'dim2')
...
python3.11-xarray-einstats>   /build/source/tests/test_numba.py:66: DeprecationWarning: `product` is deprecated as of NumPy 1.25.0, and will be removed in NumPy 2.0. Please use `prod` instead.
python3.11-xarray-einstats>     out = ecdf(data["mu"], npoints=data["mu"].size)
python3.11-xarray-einstats> -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
python3.11-xarray-einstats> =========================== short test summary info ============================
python3.11-xarray-einstats> FAILED tests/test_linalg.py::TestWrappers::test_svd - ValueError: new dimensions ('batch', 'experiment', 'pointwise_sel') must be...
python3.11-xarray-einstats> FAILED tests/test_linalg.py::TestWrappers::test_svd_non_square - ValueError: new dimensions ('batch', 'pointwise_sel', 'dim2') must be a sup...
python3.11-xarray-einstats> ============ 2 failed, 258 passed, 1 skipped, 5 warnings in 28.80s =============
...

...with the following dependencies (seem to satisfy the constraints in pyproject.toml):

  "/nix/store/pkj9pa0wvflgbbkh33r2my604sh6hljp-python3.11-numba-0.58.1.drv",
  "/nix/store/sd9592awi9vhak7w673wpq5nmxgmnkhl-python3.11-scipy-1.11.4.drv",
...
  "/nix/store/vk8p7n865vavwz0c6plmyykl75g2xypz-python3.11-xarray-2023.11.0.drv",
  "/nix/store/wv8c3ck1dx0qxzv049w5ris4acwhgnrg-python3.11-einops-0.7.0.drv",
  "/nix/store/xf4ckl07j2kasl8pzkwfbfzmxzrsrghi-python3.11-numpy-1.26.2.drv",

distutils.errors.DistutilsOptionError: No configuration found for dynamic 'description'.

Build fails on FreeBSD:

/usr/local/lib/python3.8/site-packages/setuptools/config/pyprojecttoml.py:102: _ExperimentalProjectMetadata: Support for project metadata in `pyproject.toml` is still experimental and may be removed (or change) in future releases.
  warnings.warn(msg, _ExperimentalProjectMetadata)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "setup.py", line 1, in <module>
    import setuptools; setuptools.setup()
  File "/usr/local/lib/python3.8/site-packages/setuptools/__init__.py", line 87, in setup
    return distutils.core.setup(**attrs)
  File "/usr/local/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 122, in setup
    dist.parse_config_files()
  File "/usr/local/lib/python3.8/site-packages/setuptools/dist.py", line 854, in parse_config_files
    pyprojecttoml.apply_configuration(self, filename, ignore_option_errors)
  File "/usr/local/lib/python3.8/site-packages/setuptools/config/pyprojecttoml.py", line 54, in apply_configuration
    config = read_configuration(filepath, True, ignore_option_errors, dist)
  File "/usr/local/lib/python3.8/site-packages/setuptools/config/pyprojecttoml.py", line 134, in read_configuration
    return expand_configuration(asdict, root_dir, ignore_option_errors, dist)
  File "/usr/local/lib/python3.8/site-packages/setuptools/config/pyprojecttoml.py", line 189, in expand_configuration
    return _ConfigExpander(config, root_dir, ignore_option_errors, dist).expand()
  File "/usr/local/lib/python3.8/site-packages/setuptools/config/pyprojecttoml.py", line 236, in expand
    self._expand_all_dynamic(dist, package_dir)
  File "/usr/local/lib/python3.8/site-packages/setuptools/config/pyprojecttoml.py", line 271, in _expand_all_dynamic
    obtained_dynamic = {
  File "/usr/local/lib/python3.8/site-packages/setuptools/config/pyprojecttoml.py", line 272, in <dictcomp>
    field: self._obtain(dist, field, package_dir)
  File "/usr/local/lib/python3.8/site-packages/setuptools/config/pyprojecttoml.py", line 309, in _obtain
    self._ensure_previously_set(dist, field)
  File "/usr/local/lib/python3.8/site-packages/setuptools/config/pyprojecttoml.py", line 295, in _ensure_previously_set
    raise OptionError(msg)
distutils.errors.DistutilsOptionError: No configuration found for dynamic 'description'.
Some dynamic fields need to be specified via `tool.setuptools.dynamic`
others must be specified via the equivalent attribute in `setup.py`.
*** Error code 1

Version: 0.2.2
Python-3.8
FreeBSD 13.1

Support non-string dimension names in einops module

It should be possible to allow any hashable when using lists as input for pattern and pattern_aux. This issue is to track work on this. It might also be possible to be a bit more flexible with types, but I am not very versed in that, so help is very welcome on any of this.
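
One possible approach, sketched here under stated assumptions, is to map each hashable dimension name to a placeholder string token before building the string pattern that einops expects, and to map the tokens back afterwards. The helper below is hypothetical and not part of xarray-einstats. It assumes the input is a flat list of dimension names and the output is a list whose items are either a single name or a list of names to stack together.

```python
from typing import Dict, Hashable, List, Sequence, Tuple, Union

OutputItem = Union[Hashable, List[Hashable]]


def to_einops_pattern(
    in_dims: Sequence[Hashable], out_dims: Sequence[OutputItem]
) -> Tuple[str, Dict[Hashable, str]]:
    """Build an einops-style pattern string from arbitrary hashable dimension names."""
    tokens: Dict[Hashable, str] = {}

    def token(dim: Hashable) -> str:
        # Assign 'd0', 'd1', ... in order of first appearance.
        if dim not in tokens:
            tokens[dim] = f"d{len(tokens)}"
        return tokens[dim]

    left = " ".join(token(dim) for dim in in_dims)
    right_parts = []
    for item in out_dims:
        if isinstance(item, list):
            # A list groups several dimensions into one stacked axis.
            right_parts.append("(" + " ".join(token(dim) for dim in item) + ")")
        else:
            right_parts.append(token(item))
    return f"{left} -> {' '.join(right_parts)}", tokens


pattern, mapping = to_einops_pattern([0, ("a", 1), "draw"], [[0, "draw"], ("a", 1)])
print(pattern)  # d0 d1 d2 -> (d0 d2) d1
```

Lists mark groups here because tuples are themselves hashable and could be legitimate dimension names.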
