
etils's Introduction

Etils


etils (eclectic utils) is an open-source collection of utilities for Python.

Each top-level submodule is a self-contained independent module (with its own BUILD rule), meant to be imported individually. To avoid collisions with other modules/variables, module names are prefixed by e (arbitrary convention):

from etils import epath  # Path utils
from etils import epy  # Python utils
from etils import ejax  # Jax utils
...

Because each module is independent, only the minimal required libraries are imported (for example, importing epy won't incur the cost of importing TF, jax, ...).

Documentation

Installation

Because each module is independent and requires different dependencies, you can select which modules' dependencies to install:

pip install etils[array_types,epath,epy]

This is not an official Google product.

etils's People

Contributors

andsteing, conchylicultor, daniel-character, ebrevdo, fineguy, hawkinsp, hbq1, hmeyer, joshiayush, liangyaning33, marcenacp, peterzhizhin, ppwwyyxx, qwlouse, rchen152, tomvdw, truncs, yilei


etils's Issues

pathlib.Path.rglob is recursive but etils.epath.Path.rglob isn't

Was trying to add gs://... support to a TFDS builder that was working with local files, and figured I'd try to simply replace pathlib with etils.epath since it was already in tfds.core.Path and would involve minimal changes.

Unfortunately got tripped up on

AssertionError: Not a single example present in the PCollection! [while running 'test_write/GetBoundaries']

which seems to be caused by rglob not actually being recursive, as per: https://github.com/google/etils/blob/main/etils/epath/abstract_path.py#L111

Mentioning this here just in case anyone else got confused by the same thing.

Example

In [1]: import pathlib

In [2]: import etils.epath

In [3]: list(etils.epath.Path('.').rglob("*.py"))
Out[3]: [PosixGPath('builders/__init__.py')]

In [4]: list(pathlib.Path('.').rglob("*.py"))
Out[4]: 
[PosixPath('builders/__init__.py'),
 PosixPath('builders/my_dataset/my_dataset_test.py'),
 PosixPath('builders/my_dataset/my_dataset.py'),
 PosixPath('builders/my_dataset/__init__.py')]
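For anyone hitting the same thing, one possible workaround is to recurse manually. The following is only a minimal sketch: `rglob_fallback` is a hypothetical helper (not part of etils), and it assumes the path object exposes `iterdir()`, `is_dir()` and `.name` the way `pathlib.Path` does. It is demonstrated here with `pathlib` on a throwaway local directory.

```python
import fnmatch
import pathlib
import tempfile
from typing import Iterator


def rglob_fallback(root, pattern: str) -> Iterator:
    """Yield every entry under `root` whose name matches `pattern`.

    Hypothetical helper: relies only on `iterdir()`, `is_dir()` and `.name`.
    """
    for child in root.iterdir():
        if fnmatch.fnmatch(child.name, pattern):
            yield child
        if child.is_dir():
            # Descend into sub-directories to emulate a recursive glob.
            yield from rglob_fallback(child, pattern)


# Demo on a temporary tree that mirrors part of the layout in the issue.
with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    (base / "builders" / "my_dataset").mkdir(parents=True)
    (base / "builders" / "__init__.py").touch()
    (base / "builders" / "my_dataset" / "my_dataset.py").touch()
    found = sorted(
        p.relative_to(base).as_posix() for p in rglob_fallback(base, "*.py")
    )
    print(found)
```

Whether this behaves well on remote backends depends on how `is_dir()` is implemented there, so treat it as a local-filesystem illustration.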

cannot "pip install etils"

Hi,

I cannot install etils[epath] through pip. I get the following error message:

ERROR: Could not find a version that satisfies the requirement etils[epath] (from versions: none)
ERROR: No matching distribution found for etils[epath]

Thank you in advance!

AttributeError when calling etils.eapp.better_logging

Hello,

When I am trying to use eapp.better_logging, it seems the module epy is trying to call a non-existent method is_borg

Traceback (most recent call last):
  File "<project path>/test_etils.py", line 20, in <module>
    app.run(main, flags_parser=eapp.make_flags_parser(Args))
  File "<env path>/lib/python3.9/site-packages/absl/app.py", line 306, in run
    callback()
  File "<env path>/lib/python3.9/site-packages/etils/eapp/logging_utils.py", line 64, in _better_logging
    if epy.is_borg():
AttributeError: module 'etils.epy' has no attribute 'is_borg'

Minimal code for reproducing the issue with python3.9:

import dataclasses

from absl import app
from etils import eapp


@dataclasses.dataclass
class Args:  # Define `--user=some_user --verbose` CLI flags
    user: str
    verbose: bool = False


def main(args: Args):
    if args.verbose:
        print(args.user)


if __name__ == "__main__":
    eapp.better_logging()
    app.run(main, flags_parser=eapp.make_flags_parser(Args))

Best,
Hicham

When using edc and frozen/unfrozen pylint and mypy complain

Hello,

Thanks for the cool library. When using the edc module, and in particular the frozen/unfrozen functions, both mypy and pylint complain about types.

This is fairly easy to reproduce, just by using the following snippet:

from dataclasses import dataclass
from etils import edc

@edc.dataclass(allow_unfrozen=True)
@dataclass(frozen=True)
class MyConf:
    my_var: str = "epic"

if __name__ == "__main__":
    qqq = MyConf(my_var="ff")
    
    # pylint and mypy complain!
    fff = qqq.unfrozen()
    ppp = fff.frozen()

Specifically pylint has the following output:

E1101: Instance of 'MyConf' has no 'unfrozen' member (no-member)

whereas mypy gives:

"MyConf" has no attribute "unfrozen"  [attr-defined]

I suspect both are for the same reason... is there a clean way (without ignoring these) to use this module while making both mypy and pylint happy?

For what it's worth, I am using Ubuntu 22.04, Python 3.10, pylint 2.14.5, and mypy 0.971.

Support walk in epath?

hi there!

is there a plan to support walk in epath for different backends? i can contribute if this feels like a good idea.
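To make the request concrete, here is a rough sketch of what an `os.walk`-style traversal could look like when built only on `iterdir()`, `is_dir()` and `.name`. The function `walk` below is hypothetical, not an existing epath API, and it is exercised with `pathlib` on a temporary directory rather than with a real backend.

```python
import pathlib
import tempfile
from typing import Iterator, List, Tuple


def walk(top) -> Iterator[Tuple[object, List[str], List[str]]]:
    """Top-down traversal yielding `(dirpath, dirnames, filenames)`.

    Hypothetical sketch: uses only `iterdir()`, `is_dir()` and `.name`.
    """
    dirs, files = [], []
    for child in top.iterdir():
        (dirs if child.is_dir() else files).append(child)
    yield top, sorted(d.name for d in dirs), sorted(f.name for f in files)
    # Visit sub-directories in name order for deterministic output.
    for d in sorted(dirs, key=lambda p: p.name):
        yield from walk(d)


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "a" / "b").mkdir(parents=True)
    (root / "a" / "x.txt").touch()
    visited = [
        (p.relative_to(root).as_posix(), ds, fs) for p, ds, fs in walk(root)
    ]
    print(visited)
```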

broken link in ecolab/docs/demo.ipynb

In ecolab/docs/demo.ipynb replace

View all available imports in the code: https://github.com/google/etils/tree/main/etils/ecolab/lazy_imports.py;l=412

with

View all available imports in the code: https://github.com/google/etils/tree/main/etils/ecolab/lazy_imports.py#l=412

pip installing etils spews warnings

I ran pip3 install etils[all] and then a subsequent command that installs jax (which pulls in etils as a transitive dependency), and I get these warnings polluting the terminal:

WARNING: etils 0.2.0 does not provide the extra 'edc'
WARNING: etils 0.3.3 does not provide the extra 'edc'
WARNING: etils 0.3.2 does not provide the extra 'edc'
WARNING: etils 0.3.1 does not provide the extra 'edc'
WARNING: etils 0.3.0 does not provide the extra 'edc'
WARNING: etils 0.2.0 does not provide the extra 'edc'
WARNING: etils 0.3.3 does not provide the extra 'edc'
WARNING: etils 0.3.2 does not provide the extra 'edc'
WARNING: etils 0.3.1 does not provide the extra 'edc'
WARNING: etils 0.3.0 does not provide the extra 'edc'
WARNING: etils 0.2.0 does not provide the extra 'edc'
WARNING: etils 0.3.3 does not provide the extra 'edc'
WARNING: etils 0.3.2 does not provide the extra 'edc'
WARNING: etils 0.3.1 does not provide the extra 'edc'
WARNING: etils 0.3.0 does not provide the extra 'edc'
WARNING: etils 0.2.0 does not provide the extra 'edc'
WARNING: etils 0.3.3 does not provide the extra 'edc'

Can't you just integrate your code into jax or something? this lib has ELEVEN different sub-targets. you are creating a dependency game that really really nobody wants. at the very least please consider a way to silence this error, even if it means minimizing your package structure and getting your code into upstream.

[Enhancement] Add `optree` integration to `etils.etree`

optree is a standalone package (like dm-tree) aimed at high-performance PyTree manipulation (like jax.tree_util). It offers APIs similar to jax.tree_util, with better performance in the benchmarks below.

Some initial benchmark results:

Average Time Cost (↓)        OpTree (v0.9.0)  JAX XLA (v0.4.6)  PyTorch (v2.0.0)  TensorFlow Nest (v2.12.0)  DM-Tree (v0.1.8)
Tree Flatten                 x1.00            2.33              22.05             1.38                       1.12
Tree UnFlatten               x1.00            2.69              4.28              13.69                      16.23
Tree Flatten with Path       x1.00            16.16             Not Supported     21.10                      27.59
Tree Copy                    x1.00            2.56              9.97              9.62                       11.02
Tree Map                     x1.00            2.56              9.58              9.16                       10.62
Tree Map (nargs)             x1.00            2.89              Not Supported     74.26                      31.33
Tree Map with Path           x1.00            7.23              Not Supported     40.78                      19.66
Tree Map with Path (nargs)   x1.00            6.56              Not Supported     69.63                      29.61

We have already seen some etils folks get involved in optree and jax.tree_util discussions. I wonder whether the etils maintainers are interested in adding optree to etils.etree.


etils example fails to load on public colab due to missing mediapy dependency

running the example colab out of the box on the public instance of Google Colab (colab.research.google.com) immediately fails with:

[/usr/local/lib/python3.10/dist-packages/etils/ecolab/array_as_img.py](https://localhost:8080/#) in <module>
     37   import IPython
     38   import IPython.display
---> 39   import mediapy as media
     40   # pylint: enable=g-import-not-at-top
     41 

ModuleNotFoundError: No module named 'mediapy'

Each etils sub-module requires its deps to be installed separately (e.g. `from etils import ecolab` -> `pip install etils[ecolab]`)

adlfs support

Hey y'all,

Currently you offer support for s3:// (using s3fs) and gs:// (using gcsfs). If it's not too much work, could you please add support for az://, i.e. Azure Blob Storage, using fsspec's adlfs?

epath behaves weird using fsspec on gcs

Via GCloud Console I clicked "CREATE FOLDER" to create empty_folder.
Then I did:

$ touch /tmp/test.txt
$ gsutil cp /tmp/test.txt gs://henning-test/folder/test.txt

$ python
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from etils import epath
>>> epath.gpath._is_tf_installed()
False
>>> e = epath.Path('gs://henning-test/empty_folder')
>>> f'{e.exists()=} {e.is_dir()=}'
'e.exists()=True e.is_dir()=False'
>>> list(e.iterdir())
[PosixGPath('gs://henning-test/empty_folder')]

>>> f = epath.Path('gs://henning-test/folder')
>>> f'{f.exists()=} {f.is_dir()=}'
'f.exists()=True f.is_dir()=False'

>>> t = epath.Path('gs://henning-test/folder/test.txt')
>>> f'{t.exists()=} {t.is_dir()=}'
't.exists()=True t.is_dir()=False'
>>> list(t.iterdir())
[PosixGPath('gs://henning-test/folder/test.txt/test.txt')]

Package broken under Python 3.9

The epath sub-package is broken under Python 3.9, apparently due to the recent introduction of syntax only supported in Python 3.10 or later in binary_import.py:

import etils.epath

...

File ~/condahome/miniconda3/envs/py39t/lib/python3.9/site-packages/etils/epy/binary_import.py:55
     49       return True
     50   return False
     53 @contextlib.contextmanager
     54 def binary_adhoc(
---> 55     restrict: None | py_utils.StrOrStrList = None,
     56     verbose: bool = False,
     57     **kwargs: Any,
     58 ) -> Iterator[None]:
     59   yield

TypeError: unsupported operand type(s) for |: 'NoneType' and '_UnionGenericAlias'
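For context, annotations in a `def` are evaluated when the function is defined, and the `X | Y` union syntax only became available in Python 3.10, which is why `None | <typing alias>` raises on 3.9. A spelling that also works on 3.9 is `typing.Optional`. The snippet below is just a sketch of that spelling: `StrOrStrList` and `adhoc_sketch` are stand-ins written for this example, not the actual etils definitions.

```python
import contextlib
from typing import Any, Iterator, List, Optional, Union

# Stand-in alias for illustration; the real etils alias may differ.
StrOrStrList = Union[str, List[str]]


@contextlib.contextmanager
def adhoc_sketch(
    restrict: Optional[StrOrStrList] = None,  # works on 3.9, unlike `None | ...`
    verbose: bool = False,
    **kwargs: Any,
) -> Iterator[None]:
    yield


entered = False
with adhoc_sketch(restrict="some_module"):
    entered = True
```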

v1.5.2: Test fails for python3.11

I am on Arch and while installing the package, I see the following test failures

========================================================================================================================================================== FAILURES ==========================================================================================================================================================
__________________________________________________________________________________________________________________________________________________ test_interp_scalar[jnp] ___________________________________________________________________________________________________________________________________________________

xnp = <module 'jax.numpy' from '/usr/lib/python3.11/site-packages/jax/numpy/__init__.py'>

    @enp.testing.parametrize_xnp()
    def test_interp_scalar(xnp: enp.NpModule):
      vals = xnp.asarray(
          [
              [-1, -1],
              [-1, 0],
              [-1, 1],
              [0.5, 1],
              [1, 1],
          ]
      )
    
      #
    
      out = enp.interp(vals, from_=(-1, 1), to=(0, 256))
      assert enp.compat.is_array_xnp(out, xnp)
    
      np.testing.assert_allclose(
          out,
          xnp.asarray([
              [0, 0],
              [0, 128],
              [0, 256],
              [192, 256],
              [256, 256],
          ]),
      )
      np.testing.assert_allclose(
          enp.interp(vals, from_=(-1, 1), to=(0, 1)),
          xnp.asarray([
              [0, 0],
              [0, 0.5],
              [0, 1],
              [0.75, 1],
              [1, 1],
          ]),
      )
    
      vals = xnp.asarray(
          [
              [255, 255, 0],
              [255, 128, 0],
              [255, 0, 128],
          ]
      )
>     np.testing.assert_allclose(
          enp.interp(vals, from_=(0, 255), to=(0, 1)),
          xnp.asarray([
              [1, 1, 0],
              [1, 128 / 255, 0],
              [1, 0, 128 / 255],
          ]),
      )

etils/enp/interp_utils_test.py:70: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (<function assert_allclose.<locals>.compare at 0x7fe9a8bc6840>, array([[0.99999994, 0.99999994, 0.        ],
       [0...     , 0.       ],
       [1.       , 0.5019608, 0.       ],
       [1.       , 0.       , 0.5019608]], dtype=float32))
kwds = {'equal_nan': True, 'err_msg': '', 'header': 'Not equal to tolerance rtol=1e-07, atol=0', 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError: 
E           Not equal to tolerance rtol=1e-07, atol=0
E           
E           Mismatched elements: 2 / 9 (22.2%)
E           Max absolute difference: 5.9604645e-08
E           Max relative difference: 1.1874362e-07
E            x: array([[1.      , 1.      , 0.      ],
E                  [1.      , 0.501961, 0.      ],
E                  [1.      , 0.      , 0.501961]], dtype=float32)
E            y: array([[1.      , 1.      , 0.      ],
E                  [1.      , 0.501961, 0.      ],
E                  [1.      , 0.      , 0.501961]], dtype=float32)

/usr/lib/python3.11/contextlib.py:81: AssertionError
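The reported numbers look consistent with a one-ULP float32 discrepancy: the spacing between float32 values in [0.5, 1) is 2**-24 (about 5.96e-08), and dividing that by 128/255 gives a relative difference slightly above the `rtol=1e-07` used by the assertion. The sketch below reproduces that arithmetic with the standard library only; the "off by one ULP" value is constructed for illustration and is an assumption about the cause, not something taken from the jax internals.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


expected = to_float32(128 / 255)
one_ulp = 2.0 ** -24  # float32 spacing for values in [0.5, 1)
off_by_one_ulp = expected + one_ulp  # constructed value, one ULP away

rel_diff = abs(off_by_one_ulp - expected) / abs(expected)
print(f"relative difference: {rel_diff:.4e}")

# A 1e-7 relative tolerance rejects a one-ULP difference, 1e-6 accepts it.
strict_ok = math.isclose(off_by_one_ulp, expected, rel_tol=1e-7)
relaxed_ok = math.isclose(off_by_one_ulp, expected, rel_tol=1e-6)
print(strict_ok, relaxed_ok)
```

If that reading is right, comparing float32 results with a tolerance around 1e-6 would avoid the spurious failure.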

Tests fail on Python 3.10

There are two test failures with Python 3.10. With Python 3.9 everything seems fine. Could you have a look?

================================================================================== FAILURES ==================================================================================
_________________________________________________________________________________ test_repr __________________________________________________________________________________

    def test_repr():
>     assert repr(R(123, R11(y='abc'))) == epy.dedent("""
      R(
          x=123,
          y=R11(
              x=None,
              y='abc',
              z=None,
          ),
      )
      """)
E     assert "R(x=123, y=R...bc', z=None))" == 'R(\n    x=12...e,\n    ),\n)'
E       + R(x=123, y=R11(x=None, y='abc', z=None))
E       - R(
E       -     x=123,
E       -     y=R11(
E       -         x=None,
E       -         y='abc',
E       -         z=None,...
E
E       ...Full output truncated (3 lines hidden), use '-vv' to show

etils/edc/dataclass_utils_test.py:108: AssertionError
_____________________________________________________________________________ test_resource_path _____________________________________________________________________________

    def test_resource_path():
      path = epath.resource_utils.ResourcePath(_make_zip_file())
      assert isinstance(path, os.PathLike)
      assert path.joinpath('b/c.txt').read_text() == 'content of c'
      sub_dirs = list(path.joinpath('b').iterdir())
      assert len(sub_dirs) == 3
      for p in sub_dirs:  # Childs should be `ResourcePath` instances
        assert isinstance(p, epath.resource_utils.ResourcePath)

      # Forwarded to `Path` keep the resource.
      path = epath.Path(path)
      assert isinstance(path, epath.resource_utils.ResourcePath)

>     assert path.joinpath() == path
E     AssertionError: assert ResourcePath('alpharep.zip', '') == ResourcePath('alpharep.zip', '')
E      +  where ResourcePath('alpharep.zip', '') = <bound method Path.joinpath of ResourcePath('alpharep.zip', '')>()
E      +    where <bound method Path.joinpath of ResourcePath('alpharep.zip', '')> = ResourcePath('alpharep.zip', '').joinpath

For test_repr, apparently custom __repr__ is not applied as __qualname__ is changed in Python 3.10.

For test_resource_path, joinpath() returns a new object for Python >= 3.10 as that function is not overridden.

I noticed those failures while creating an unofficial package, python-etils, for Arch Linux as a new dependency for the latest python-tensorflow-datasets.

Environment: Arch Linux x86_64, Python 3.10.4

Using etils's `epath` with AWS S3 links

Hey,

Apparently, one can't use S3 objects with etils:

from etils import epath
a = epath.Path( 's3://....' ) #link points to a folder of .tfrecords
sorted(epath.Path('s3://..').iterdir())

Error

Yields,

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/etils/epath/gpath.py", line 126, in iterdir
    for f in self._backend.listdir(self._path_str):
  File "/usr/local/lib/python3.8/dist-packages/etils/epath/backend.py", line 191, in listdir
    return self.gfile.listdir(path)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/lib/io/file_io.py", line 766, in list_directory_v2
    raise errors.NotFoundError(
tensorflow.python.framework.errors_impl.NotFoundError: Could not find directory s3://..

This is used in TFDS simply to compute some metadata for the datasets I'd be working with.

I can confirm the given link exists when copy-pasted into the aws s3 ls s3://.... command. It seems to be an issue with etils, or I'm using it wrongly.

Cheers!

edc: unfrozen dataclasses are hashable

I think the general consensus in Python is that mutable types should not be hashable; however unfrozen dataclasses are hashable. See code below:

import dataclasses
from etils import edc
@edc.dataclass(allow_unfrozen=True)
@dataclasses.dataclass(eq=True, kw_only=True, frozen=True)
class Conf:
    x: int = 3
a = Conf()
a = a.unfrozen()
a.x = 2
hash(a) # should fail

ps. edc is really handy
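For comparison, this is how plain `dataclasses` handles the same convention, without etils involved: with `eq=True` and no `frozen=True`, the generated class has `__hash__` set to `None`, so hashing an instance raises `TypeError`. The class names below are made up for the example.

```python
import dataclasses


@dataclasses.dataclass(eq=True)
class MutableConf:
    x: int = 3


@dataclasses.dataclass(eq=True, frozen=True)
class FrozenConf:
    x: int = 3


frozen_hash = hash(FrozenConf())  # frozen + eq: hashable

try:
    hash(MutableConf())
    mutable_is_hashable = True
except TypeError:
    # eq=True without frozen=True sets __hash__ to None.
    mutable_is_hashable = False

print(mutable_is_hashable)
```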

etils.epath.Path.glob() removes gs:// prefix

I am trying to iterate over a "directory" in a bucket using glob(). Unfortunately the returned Path objects are missing the gs:// part:

from etils.epath import Path

dir = Path(
        "gs://gcp-public-data-arco-era5/ar/1959-2022-1h-240x121_equiangular_with_poles_conservative.zarr/lake_depth"
    )
files = list(dir.glob("*"))
files[0].read_bytes() # fails with FileNotFoundError

Is this intended behavior?

Broken import of ecolab

The import of ecolab is broken even on the demo colab with the following error:

AttributeError: 'TransformerManager' object has no attribute 'python_line_transforms'
Full error
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-3-aa5e0d6acf5e>](https://localhost:8080/#) in <cell line: 1>()
----> 1 from etils.lazy_imports import *

2 frames
[/usr/local/lib/python3.10/dist-packages/etils/lazy_imports/__init__.py](https://localhost:8080/#) in <module>
     15 """Alias of `etils.ecolab.lazy_imports`."""
     16 
---> 17 from etils.ecolab import lazy_imports
     18 from etils.ecolab.lazy_imports import *
     19 

[/usr/local/lib/python3.10/dist-packages/etils/ecolab/__init__.py](https://localhost:8080/#) in <module>
     33 
     34 # Activate auto-display by default
---> 35 auto_display()

[/usr/local/lib/python3.10/dist-packages/etils/ecolab/auto_display_utils.py](https://localhost:8080/#) in auto_display(activate)
     61   _clear_transform(shell.ast_transformers)
     62   if _IS_LEGACY_API:
---> 63     _clear_transform(shell.input_transformer_manager.python_line_transforms)
     64     _clear_transform(shell.input_splitter.python_line_transforms)
     65   else:

AttributeError: 'TransformerManager' object has no attribute 'python_line_transforms'

Freezing unfrozen dataclasses requires unfreezing them beforehand.

import dataclasses
from etils import edc
@edc.dataclass(allow_unfrozen=True)
@dataclasses.dataclass(eq=True, kw_only=True)
class Conf:
    x: int = 3

a = Conf()
a.x = 2 # verify that a is not frozen
a.frozen() # complains that .frozen() can only be called after .unfrozen()

As the code above shows, allow_unfrozen seems to assume that the dataclass is frozen by default. I think that it should verify the assumption.
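One way such a check could look, sketched with the standard library only: CPython's `dataclasses` records the decorator arguments on the class as `__dataclass_params__`, which exposes a `frozen` flag. That attribute is an implementation detail rather than documented API, and `require_frozen` is a hypothetical decorator written for this example, not something edc provides.

```python
import dataclasses


def is_frozen_dataclass(cls) -> bool:
    """Return True if `cls` was declared with `frozen=True`.

    Relies on `__dataclass_params__`, a CPython implementation detail.
    """
    params = getattr(cls, "__dataclass_params__", None)
    return params is not None and bool(params.frozen)


def require_frozen(cls):
    """Hypothetical class decorator that rejects non-frozen dataclasses."""
    if not is_frozen_dataclass(cls):
        raise TypeError(f"{cls.__name__} must be declared with frozen=True")
    return cls


@require_frozen
@dataclasses.dataclass(frozen=True)
class FrozenConf:
    x: int = 3


try:
    @require_frozen
    @dataclasses.dataclass
    class Conf:
        x: int = 3
    rejected = False
except TypeError:
    rejected = True

print(rejected)
```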
