
eva's People

Contributors

a-thiery, ioangatop, nkaenzig, renovate[bot], roman807

eva's Issues

Add a model wrapper for functions

A lot of libraries define models through functions. However, the YAML-based configuration, due to type verification, cannot support them.

Thus, we would like to have a class wrapper which calls the function, builds and initialises the model within the class.
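
A minimal sketch of what such a wrapper could look like; the class name ModelFromFunction and its signature are illustrative, not a settled API:

from typing import Any, Callable, Dict, Optional

import torch.nn as nn


class ModelFromFunction(nn.Module):
    """Wraps a model-building function into an nn.Module (illustrative sketch)."""

    def __init__(
        self,
        path: Callable[..., nn.Module],
        arguments: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__()
        # Call the builder function with its keyword arguments to instantiate the model.
        self._model = path(**(arguments or {}))

    def forward(self, *args: Any, **kwargs: Any) -> Any:
        return self._model(*args, **kwargs)

From a YAML config the wrapper would then receive the function path (for example timm.create_model) together with its keyword arguments.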

Add image IO utils

We need functions to load image files into numpy arrays or torch tensors.
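
A possible sketch using Pillow, which the dataset classes could reuse (the function names are illustrative):

import numpy as np
import torch
from PIL import Image


def read_image_as_array(path: str) -> np.ndarray:
    """Loads an image file into an (H, W, C) uint8 numpy array."""
    with Image.open(path) as image:
        return np.asarray(image.convert("RGB"))


def read_image_as_tensor(path: str) -> torch.Tensor:
    """Loads an image file into a (C, H, W) float tensor scaled to [0, 1]."""
    array = read_image_as_array(path)
    return torch.from_numpy(array).permute(2, 0, 1).float() / 255.0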

Add central logger

We would like to have a central logger to be used across the library. A great choice would be loguru.
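
Configuring it once in a central place could look roughly like this (the log level and format string are assumptions):

import sys

from loguru import logger

# Replace loguru's default handler with a single, library-wide sink.
logger.remove()
logger.add(sys.stderr, level="INFO", format="{time:HH:mm:ss} | {level} | {message}")

logger.info("eva initialised")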

Add CI manager

We would like to add support for a CI manager to manage multiple Python test environments. For that, we recommend nox.

nox is a command-line tool that automates testing in multiple Python environments, using a standard Python file (noxfile.py) for configuration. Nox will automatically create a virtualenv with the appropriate interpreter, install the specified dependencies, and run the commands in order.
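
For illustration, a minimal noxfile session (the Python versions, dependency group and lint command are assumptions; the actual test session is discussed in a separate issue below):

import nox

PYTHON_VERSIONS = ["3.10", "3.11"]


@nox.session(python=PYTHON_VERSIONS, tags=["lint"])
def lint(session: nox.Session) -> None:
    """Runs the code linters in an isolated virtualenv per Python version."""
    session.run("pdm", "install", "--group", "lint", external=True)
    session.run("pdm", "run", "ruff", "check", "src", "tests")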

Add a module for generating splits

It would be good to add a module / utils file that implements some of the split methods we use, so that we can share that logic among the different dataset classes (see the sketch after the list below).

Examples:

  • ordered splits
  • random splits
    • std
    • balanced / stratified

Also, it should be possible to choose between train/val/test and train/test splits.
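
A sketch of what the random split helpers could look like (function names, signatures and default ratios are assumptions); a train/test-only split follows by setting the validation ratio to zero:

from typing import List, Sequence, Tuple

import numpy as np


def random_split(
    n_samples: int,
    ratios: Tuple[float, float, float] = (0.7, 0.15, 0.15),
    seed: int = 42,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Randomly splits sample indices into train / val / test subsets."""
    indices = np.random.default_rng(seed).permutation(n_samples)
    n_train = int(ratios[0] * n_samples)
    n_val = int(ratios[1] * n_samples)
    return indices[:n_train], indices[n_train : n_train + n_val], indices[n_train + n_val :]


def stratified_split(
    targets: Sequence[int],
    ratios: Tuple[float, float, float] = (0.7, 0.15, 0.15),
    seed: int = 42,
) -> Tuple[List[int], List[int], List[int]]:
    """Randomly splits indices while keeping the class proportions per subset."""
    train, val, test = [], [], []
    rng = np.random.default_rng(seed)
    labels = np.asarray(targets)
    for label in np.unique(labels):
        class_indices = rng.permutation(np.flatnonzero(labels == label))
        n_train = int(ratios[0] * len(class_indices))
        n_val = int(ratios[1] * len(class_indices))
        train.extend(class_indices[:n_train].tolist())
        val.extend(class_indices[n_train : n_train + n_val].tolist())
        test.extend(class_indices[n_train + n_val :].tolist())
    return train, val, test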

Add common image transforms

In order to minimise code and config duplication and to increase reliability, we would like to add some common image transforms.
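
For example, a shared resize / crop / normalise pipeline built on torchvision (the image size and the ImageNet normalisation statistics are assumptions and should be configurable):

import torchvision.transforms as T

common_transforms = T.Compose(
    [
        T.Resize(size=256),
        T.CenterCrop(size=224),
        T.ToTensor(),
        T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    ]
)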

Initialise a vision library group

In order for eva to support multiple modalities efficiently, meaning that core code can be shared without installing the package dependencies of all modalities, we would like to make each modality an optional dependency group of the library.

We will start with the vision one, called eva-vision.

Enable cache in CI

To speed up the CI, we can enable the cache options that the setup-pdm workflow provides.

Add a LinearClassifier model module

One of the most popular downstream evaluation tasks is training a simple linear classification head on top of a foundation model.

We should have native support for this downstream task.
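
A minimal sketch of the module (the class name and signature are illustrative): a single linear layer mapping frozen foundation-model embeddings to class logits.

import torch
import torch.nn as nn


class LinearClassifier(nn.Module):
    """A linear classification head on top of pre-computed embeddings."""

    def __init__(self, in_features: int, num_classes: int) -> None:
        super().__init__()
        self.head = nn.Linear(in_features, num_classes)

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        return self.head(embeddings)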

Ignore `.pdm-python` file

The .pdm-python file stores the Python path used by the current project and doesn't need to be shared.

Add core model module

We would like to have an abstract Lightning model module which acts as the base class of all models.
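
Sketched as an abstract LightningModule subclass (the class name and the hooks it enforces are assumptions):

from typing import Any

import lightning.pytorch as pl


class ModelModule(pl.LightningModule):
    """Abstract base class that all eva model modules would extend."""

    def training_step(self, batch: Any, batch_idx: int) -> Any:
        raise NotImplementedError

    def validation_step(self, batch: Any, batch_idx: int) -> Any:
        raise NotImplementedError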

Add base dataset class for vision tasks

We need to design a base class to support vision tasks such as classification & segmentation. Note that the benchmark datasets can come in a variety of different formats, so we need to decide whether the dataset classes should use the original raw data or transform it into a standardised format.
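
One possible shape for such a base class, assuming a prepare_data / setup lifecycle similar to Lightning's (everything here is a sketch, not a settled design):

import abc

from torch.utils.data import Dataset


class VisionDataset(Dataset, abc.ABC):
    """Abstract base class for vision datasets."""

    @abc.abstractmethod
    def prepare_data(self) -> None:
        """Downloads and extracts the raw dataset files."""

    @abc.abstractmethod
    def setup(self) -> None:
        """Builds the sample index and applies the split."""

    @abc.abstractmethod
    def __getitem__(self, index: int):
        """Returns the (image, target) pair at the given index."""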

Add the core data modules

The data-related core modules will form the base of all data-related functionality. These will be the dataset, the dataloader and the datamodule.
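
As a sketch, the datamodule could wrap the datasets and dataloader arguments along the lines of Lightning's LightningDataModule (the class name and arguments are assumptions):

import lightning.pytorch as pl
from torch.utils.data import DataLoader, Dataset


class DataModule(pl.LightningDataModule):
    """Wraps the datasets and exposes the corresponding dataloaders."""

    def __init__(self, train_dataset: Dataset, val_dataset: Dataset, batch_size: int = 32) -> None:
        super().__init__()
        self.train_dataset = train_dataset
        self.val_dataset = val_dataset
        self.batch_size = batch_size

    def train_dataloader(self) -> DataLoader:
        return DataLoader(self.train_dataset, batch_size=self.batch_size, shuffle=True)

    def val_dataloader(self) -> DataLoader:
        return DataLoader(self.val_dataset, batch_size=self.batch_size)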

Drop the `Defaults to` in the code docs

In general we are following the Google Style Python Docstrings (here an example), which suggest using `Defaults to` to document parameters that have a default value. However, the default value is already visible in the function signature, so documenting it again duplicates information.

Thus we will make an exception to the docstring style, where we omit the `Defaults to` note.
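
To illustrate with a hypothetical function:

# Google style (what we will omit):
def resize(image, size: int = 224):
    """Resizes an image.

    Args:
        image: The input image.
        size: The target edge length. Defaults to 224.
    """


# eva style (the default is already visible in the signature):
def resize(image, size: int = 224):
    """Resizes an image.

    Args:
        image: The input image.
        size: The target edge length.
    """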

Add dataset class for BACH

BACH will be the first dataset we add to eva. We should design a dataset class that implements the prepare_data and setup methods from VisionDataset.

Add ABMIL network

For evaluating slide-level tasks we need implementations of MIL networks such as ABMIL.
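
For reference, a minimal sketch of the gated attention pooling at the core of ABMIL (Ilse et al., 2018); the layer sizes are illustrative:

import torch
import torch.nn as nn


class GatedAttentionPooling(nn.Module):
    """Aggregates the patch embeddings of one slide (bag) into a single embedding."""

    def __init__(self, in_features: int = 768, hidden_features: int = 128) -> None:
        super().__init__()
        self.attention_v = nn.Sequential(nn.Linear(in_features, hidden_features), nn.Tanh())
        self.attention_u = nn.Sequential(nn.Linear(in_features, hidden_features), nn.Sigmoid())
        self.attention_w = nn.Linear(hidden_features, 1)

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (num_patches, in_features) for a single slide.
        scores = self.attention_w(self.attention_v(embeddings) * self.attention_u(embeddings))
        weights = torch.softmax(scores, dim=0)         # (num_patches, 1)
        return torch.sum(weights * embeddings, dim=0)  # (in_features,)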

Introduce a STYLE_GUIDE

Let's introduce a style guide document to highlight our coding and documentation style, so that it is known and consistent across the library.

Export core classes in eva public API

This issue suggests exposing some core, frequently used classes in eva's public API to make them easier to use.

For example:

# instead of this
from eva.data import datamodules
from eva.models import modules

data = datamodules.DataModule()
model = modules.HeadModule()

# to do the following
import eva

data = eva.DataModule()
model = eva.HeadModule()

Align dataset readmes

Make sure the structure of the READMEs for the different datasets (e.g. BACH, PatchCamelyon) is consistent.

Add setup operations when library is imported

When we import or use the library, we would like to trigger some configurations automatically in order to have a better user experience.

By default (but configurable), this will automatically enable the MPS fallback for users of Apple M-series chips.
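
A sketch of what the import-time hook could do; PYTORCH_ENABLE_MPS_FALLBACK is the standard PyTorch environment variable, while the function name is illustrative:

import os


def _setup_mps_fallback() -> None:
    """Enables the CPU fallback for unsupported MPS ops, unless the user already set it."""
    # Note: this only takes effect if it runs before torch is imported.
    os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")


# Called from the package __init__ so that it runs on `import eva`.
_setup_mps_fallback()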

Obtain OpenSSF best practices badge

OpenSSF provides a blueprint for following best practices in open-source code. Many popular open-source projects (e.g. TensorFlow) have obtained their badge:

  • during development we want to keep following their guidelines (criteria)
  • once eva is published, we will apply to obtain the badge

Implement a class interface

To make it easier to interact with the library and its various components, we would like to have a common interface which encapsulates all the core parts and tasks.

Apart from the Python interface, this will also be used for the CLI interface.
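
A rough sketch of such an interface on top of Lightning (the class and method names are assumptions, not the final design):

import lightning.pytorch as pl


class Interface:
    """High-level entry point shared by the Python API and the CLI."""

    def fit(
        self,
        trainer: pl.Trainer,
        model: pl.LightningModule,
        data: pl.LightningDataModule,
    ) -> None:
        """Trains and validates the model on the given data."""
        trainer.fit(model=model, datamodule=data)

    def predict(
        self,
        trainer: pl.Trainer,
        model: pl.LightningModule,
        data: pl.LightningDataModule,
    ) -> None:
        """Runs inference with the model on the given data."""
        trainer.predict(model=model, datamodule=data)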

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

This repository currently has no open or pending branches.

Detected dependencies

github-actions
.github/workflows/ci.yaml
  • actions/checkout v4
  • actions/checkout v4
  • pdm-project/setup-pdm v4
  • wntrblm/nox 2024.03.02
  • actions/checkout v4
  • pdm-project/setup-pdm v4
  • wntrblm/nox 2024.03.02
.github/workflows/docs.yaml
  • actions/checkout v4
  • pdm-project/setup-pdm v4
  • wntrblm/nox 2024.03.02
.github/workflows/release.yaml
  • actions/checkout v4
  • pdm-project/setup-pdm v4
  • wntrblm/nox 2024.03.02
pep621
pyproject.toml
  • lightning >=2.2.1
  • jsonargparse >=4.27.4
  • tensorboard >=2.16.2
  • loguru >=0.7.2
  • pandas >=2.2.0
  • transformers >=4.38.2
  • onnxruntime >=1.17.1
  • onnx >=1.16.0
  • toolz >=0.12.1
  • rich >=13.7.1
  • vision/h5py >=3.10.0
  • vision/nibabel >=5.2.0
  • vision/opencv-python-headless >=4.9.0.80
  • vision/timm >=0.9.12
  • vision/torchvision >=0.17.0
  • all/h5py >=3.10.0
  • all/nibabel >=5.2.0
  • all/opencv-python-headless >=4.9.0.80
  • all/timm >=0.9.12
  • all/torchvision >=0.17.0
  • lint/isort >=5.12.0
  • lint/black >=23.1.0
  • lint/ruff >=0.0.254
  • lint/yamllint >=1.29.0
  • lint/bandit >=1.7.6
  • typecheck/pyright >=1.1.295
  • typecheck/pytest >=7.2.2
  • typecheck/nox >=2024.3.2
  • test/pygments >=2.14.0
  • test/pytest >=7.2.2
  • test/pytest-cov >=4.1.0
  • docs/mkdocs >=1.5.3
  • docs/mkdocs-material >=9.5.6
  • docs/mkdocstrings >=0.24.0
  • docs/mike >=2.0.0
  • docs/setuptools >=62.3.3
  • docs/markdown-exec >=0.7.0
  • docs/mkdocs-redirects >=1.2.0
  • docs/mkdocs-version-annotations >=1.0.0
  • dev/isort >=5.12.0
  • dev/black >=23.1.0
  • dev/ruff >=0.0.254
  • dev/yamllint >=1.29.0
  • dev/bandit >=1.7.6
  • dev/pyright >=1.1.295
  • dev/pytest >=7.2.2
  • dev/nox >=2024.3.2
  • dev/pygments >=2.14.0
  • dev/pytest >=7.2.2
  • dev/pytest-cov >=4.1.0
  • dev/mkdocs >=1.5.3
  • dev/mkdocs-material >=9.5.6
  • dev/mkdocstrings >=0.24.0
  • dev/mike >=2.0.0
  • dev/setuptools >=62.3.3
  • dev/markdown-exec >=0.7.0
  • dev/mkdocs-redirects >=1.2.0
  • dev/mkdocs-version-annotations >=1.0.0

  • Check this box to trigger a request for Renovate to run again on this repository

Add sys args to nox test

Adding sys args to the nox test function would allow us to run commands such as nox -s test -- tests/test_fake.py:

@nox.session(python=PYTHON_VERSIONS, tags=["test"])
def test(session: nox.Session) -> None:
    """Runs the unit tests of the source code."""
    args = session.posargs or LOCATIONS
    session.run("pdm", "install", "--group", "dev", external=True)
    session.run("pdm", "run", "pytest", "--cov", *args)

However, doing just that gives the following error when running nox -s test:

noxfile.py E                                                                                                                                              [100%]

============================================================================ ERRORS =============================================================================
____________________________________________________________________ ERROR at setup of test _____________________________________________________________________
file /Users/nkaenzig/workspace/eva-worktrees/eva-2/noxfile.py, line 61
  @nox.session(python=PYTHON_VERSIONS, tags=["test"])
  def test(session: nox.Session) -> None:
E       fixture 'session' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, cov, doctest_namespace, monkeypatch, no_cover, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

Add metadata to dataset items

For a variety of evaluation scenarios (e.g. calculating metrics) we need to access additional metadata fields such as the slide ID, patient ID or the medical center. Currently our dataset classes only return (image, target) tuples.

Two options:

  1. Tuples with an optional third value for metadata
class TUPLE_INPUT_BATCH(NamedTuple):
    """The tuple input batch data scheme."""

    data: torch.Tensor
    """The data batch."""

    targets: torch.Tensor | None = None
    """The target batch."""

    metadata: Dict[str, Any] | None = None
    """The associated metadata."""

  2. Batch dictionaries
class DatasetSample(TypedDict):
    data: Union[torch.Tensor, List[torch.Tensor]]
    """The image batch, depending on the transforms used the input will be different."""

    target: torch.Tensor | None
    """ "The target batch."""

    metadata: Dict[str, Any] | None
    """The associated metadata."""

However, there are drawbacks to using dictionaries as outlined here.
