
ipyannotator's Introduction

Ipyannotator - the infinitely hackable annotation framework


Ipyannotator is a flexible annotation system, developed to allow users to hack its features by extending and customizing it.

The large variety of annotation tasks, data formats and data visualizations is challenging when dealing with multiple domains of supervised machine learning (ML). Existing tooling is often not flexible enough and imposes limitations on the user. This project aims to solve that problem by providing a framework in which users can use, customize and create their own annotation tooling.

The library contains some pre-defined annotators that can be used out of the box, but it can also be extended and customized according to the user's needs. Check our tutorials for a quick understanding of its usage and our API for quick reference.

This library has been written in the literate programming style popularized for Jupyter notebooks by nbdev. In addition to our online documentation, the Jupyter notebooks located at nbs/ allow an interactive exploration of the inner workings of Ipyannotator.

We hope this repository helps you to explore how annotation UIs can be quickly built using only Python code, leveraging many awesome libraries (ipywidgets, voila, ipycanvas, etc.) from the Jupyter ecosystem.

At https://palaimon.io we have used the concepts underlying Ipyannotator internally for various projects, and this is our attempt to contribute back to the OSS community some of the benefits we have had using OSS software.

Please star, fork and open issues!

Please let us know if you find this repository useful. Your feedback will help us to turn this proof of concept into a comprehensive library.

Citation

If you make use of this software for your work we would appreciate it if you would cite the paper from the Journal of Open Source Software:

@article{Epifânio2022,
   title = {Ipyannotator: the infinitely hackable annotation framework},
   author = {Ítalo Epifânio and Oleksandr Pysarenko and Immanuel Bayer},
   journal = {Journal of Open Source Software},
   publisher = {The Open Journal},
   volume = {7},
   number = {76},
   pages = {4480},
   year = {2022}
} 

Install

Ipyannotator is available on PyPI and can be installed using:

pip install ipyannotator

Running ipyannotator

Ipyannotator provides a simple API to explore, create and improve annotation datasets using a pair of input/output classes. All input/output pairs are listed in Ipyannotator's docs. Check the Ipyannotator tutorials for a quick demonstration of the library.
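
A minimal sketch of that workflow, based on the classes used in the tutorials and issues below; the import path for InputImage/OutputImageLabel, the constructor arguments and the directory names are assumptions, not documented defaults:

from ipyannotator.annotator import Annotator
from ipyannotator.mltypes import InputImage, OutputImageLabel

# describe what goes in (images shown at a fixed size) and what comes out (class labels)
input_ = InputImage(image_dir='pics', image_width=300, image_height=300)
output_ = OutputImageLabel(label_dir='class_images')

annotator = Annotator(input_, output_)
annotator.explore()  # explore -> create -> improve is the intended loop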

Run ipyannotator tests

To run Ipyannotator's tests:

  1. Install poetry
  2. Create the test environment with poetry install
  3. Activate the poetry environment using poetry shell
  4. Run tests by executing nbdev_test_nbs

Run ipyannotator as stand-alone web app using voila

Ipyannotator can be executed as a web app using the voila library. The following sections describe how to run using poetry and pip.

Using poetry

On your terminal:

cd {project_root}
poetry install --no-dev

Any jupyter notebook using ipyannotator can be executed as a standalone web application. An example of voila usage is available in the current repository and can be executed as follows:

poetry run voila nbs/09_voila_example.ipynb --enable_nbextensions=True

Using pip

The installation and execution process can also be done using pip.

   cd {project_root}
   
   pip install .
   pip install voila
   
   voila nbs/09_voila_example.ipynb --enable_nbextensions=True

Jupyter lab trouble shooting

For a clean (re)install, make sure to have all the lab extensions active:

Run jupyter lab clean to remove the staging and static directories from the lab.

ipywidgets:

jupyter labextension install @jupyter-widgets/jupyterlab-manager

ipycanvas:

jupyter labextension install @jupyter-widgets/jupyterlab-manager ipycanvas

ipyevents:

jupyter labextension install @jupyter-widgets/jupyterlab-manager ipyevents

nbdime:

nbdime extensions --enable [--sys-prefix/--user/--system]

voila:

jupyter labextension install @jupyter-voila/jupyterlab-preview

How to contribute

Check out CONTRIBUTING.md. Since ipyannotator is built using nbdev, reading the nbdev tutorial and related docs will be very helpful.

Additional resources

jupytercon 2020

Acknowledgements

The authors acknowledge the financial support by the Federal Ministry for Digital and Transport of Germany under the program mFUND (project number 19F2160A).

Copyright

Copyright 2022 onwards, Palaimon GmbH. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this project's files except in compliance with the License. A copy of the License is provided in the LICENSE file in this repository.

ipyannotator's People

Contributors

alexjoz, dependabot[bot], ibayer, itepifanio


ipyannotator's Issues

[JOSS Review] Suggested revisions on paper quality

For suggested revisions for openjournals/joss-reviews#4480:

  • The paper is too long and JOSS papers are not meant to describe the API. Much of the section "A simple but flexible API to define annotation tasks" should be moved to the documentation, where it would be quite useful and welcome, though a simplified summary of the explore, create, improve workflow can remain in the paper. "Key Design Decisions" should also be cut and moved to a section of the documentation as well. It would probably be helpful to reread the Submitting a paper to JOSS instructions on what the paper should cover.

  • In the Acknowledgements section you mention

The authors acknowledge the financial support by the Federal Ministry for Digital and Transport of Germany under the program mFUND for the project OS-VAT (project number 19F2160A).

Congratulations on the funding! If possible, can you also include a reference or hyperlink to the project funding page? I'm not sure if there is another page with your actual proposal or if https://www.bmvi.de/SharedDocs/DE/Artikel/DG/mfund-projekte/os-vat.html is the actual government page for it.

Settings API usage

Ipyannotator uses an API that contains a pair of Input/Output classes and a Settings class. The Settings class contains parameters that are used by all annotators, but some of them are redundant.

This task tries to reduce the Settings class to three parameters:

from pathlib import Path
from typing import Optional

class Settings:
    project_path: Path = Path('user_project')
    project_file: Optional[Path] = None
    result_dir: Optional[str] = None

The remaining parameters should be stored in Input/Output classes. This will avoid the following redundant code (from 01b_tutorial_image_classification.ipynb):

settings_ = get_settings(dataset)
settings_.project_file, settings_.image_dir
input_ = InputImage(image_dir=settings_.image_dir,
                    image_width=settings_.im_width,
                    image_height=settings_.im_height)

output_ = OutputImageLabel(label_dir=settings_.label_dir,
                           label_width=settings_.label_width,
                           label_height=settings_.label_height)

To do this, the get_settings function needs to be refactored. A suggestion is to rename get_settings to get_api and return a tuple with input, output, and settings.
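
A hypothetical sketch of the suggested get_api (the name comes from this issue; the body is an assumption about how the dataset-specific values would be wired in, using the InputImage/OutputImageLabel classes from the snippet above and the reduced Settings proposed above):

from pathlib import Path

def get_api(dataset):
    # dataset-specific display parameters move into the Input/Output classes
    input_ = InputImage(image_dir='images', image_width=50, image_height=50)
    output_ = OutputImageLabel(label_dir='class_images')
    settings = Settings(project_path=Path(dataset))
    return input_, output_, settings

input_, output_, settings_ = get_api(dataset)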

Cleanlab and Ipyannotator integration

Cleanlab is a tool that can detect label errors in datasets. The label error tutorial shows how the tool can be used to detect errors in CIFAR-10 (and other datasets).

Create a new notebook that runs Cleanlab on the CIFAR-10 dataset and uses Ipyannotator to visualize all label errors detected by Cleanlab.

This Ipyannotator tutorial may help you with the CIFAR-10 integration.
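
A hedged sketch of the first step, assuming cleanlab 2.x and that out-of-sample predicted probabilities for CIFAR-10 are already available as numpy arrays (the file names are placeholders):

import numpy as np
from cleanlab.filter import find_label_issues

labels = np.load('cifar10_labels.npy')          # given (possibly noisy) labels, shape (N,)
pred_probs = np.load('cifar10_pred_probs.npy')  # model's predicted probabilities, shape (N, 10)

# indices of likely label errors, worst first; these are the images the new
# notebook would hand over to Ipyannotator for visual review
issue_idx = find_label_issues(labels, pred_probs, return_indices_ranked_by='self_confidence')
print(f'{len(issue_idx)} candidate label errors to review')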

Make instructions on how to run tests explicit

This issue was created in response to this comment as part of the JOSS paper review.

#31 tries to improve the documentation of the tests.

Unfortunately the instructions remain ambiguous. Instructions on how to run tests should be as unambiguous and explicit as possible. I will explain what I mean by quoting from the improved instructions:


When installing the repository using poetry, all dev dependencies are installed by default.

Instructions on how to install via poetry only come later and include the --no-dev option – meaning that test dependencies are in fact not installed unless one infers that this option must be dropped.

When using pip for installation make sure to install the two dev dependencies pytest and ipytest, with the versions listed in pyproject.toml, manually:

pip install pytest
pip install ipytest

The instructions and the shown commands contradict each other since the latter will always install the latest version of the respective packages. Further, a version for pytest is not listed in the pyproject.toml file.


It is ok to be opinionated about how tests are supposed to be run for your package. If creating the correct environment is easier done using poetry then simply make that the recommended path, you can always provide instructions for alternative paths that can then be slightly less explicit. Here is my take at the instructions:

To run tests:

  1. Install poetry
  2. Create the test environment with $ poetry install
  3. Run tests by executing $ nbdev_test_nbs

Add support for multiple bbox per image

Motivation

ipyannotator currently supports single object annotation with bounding boxes (bbox). However, often an image contains more than one object of interest that should be annotated.

Develop and Integrate Multi BBox Widget

Suggested steps:

  • explore 01c_tutorial_bbox.ipynb to see the current bbox implementation in action
  • extend 01_bbox_canvas.ipynb with a new class MultiBBoxCanvas which supports displaying and drawing of multiple bboxes (see the sketch after this list)
  • duplicate 04_bbox_annotator.ipynb as 04b_multi_bbox_annotator.ipynb and replace the BBoxCanvas with MultiBBoxCanvas
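
A hypothetical skeleton of MultiBBoxCanvas (the class does not exist yet; the HBox base and trait layout follow the existing single-bbox widget but are assumptions):

from typing import Tuple

import traitlets
from ipywidgets import HBox


class MultiBBoxCanvas(HBox):
    # each bbox is stored as an (x, y, width, height) tuple
    bboxes = traitlets.List(trait=traitlets.Tuple(), default_value=[])

    def add_bbox(self, bbox: Tuple[int, int, int, int]):
        # reassign the list so traitlets observers are notified of the change
        self.bboxes = self.bboxes + [bbox]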

Improve Ipyannotator API message error

Ipyannotator has an API to use its previously defined annotators. The API uses a pair of input/output, and when this pair is not configured the API should throw a friendly exception for the user.

Right now, when a pair is not correctly configured, the API prints a friendly message (Pair (Annotator Input type: CustomInput, Annotator Output type: NoOutput) is not supported!) but also throws an unrelated exception: AttributeError: 'NoneType' object has no attribute 'get_annotator'. This behavior can be reproduced using the following code:

from ipyannotator.mltypes import Input, Output
from ipyannotator.annotator import Annotator

class CustomInput(Input):
    pass

custom_input = CustomInput()
annotator = Annotator(custom_input)
annotator.explore()

The expected behavior is:

  • Ipyannotator throws a friendly custom exception (e.g. PairUnsupported; see the sketch after this list)
  • Don't throw the AttributeError
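
A minimal sketch of the suggested behavior (PairUnsupported is only the example name from this issue, and the lookup shown is an assumption about where the current AttributeError originates):

class PairUnsupported(Exception):
    pass


def get_annotator_factory(input_item, output_item, supported_pairs):
    factory = supported_pairs.get((type(input_item), type(output_item)))
    if factory is None:
        raise PairUnsupported(
            f'Pair (Annotator Input type: {type(input_item).__name__}, '
            f'Annotator Output type: {type(output_item).__name__}) is not supported!'
        )
    return factory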

[JOSS Review] Add automated tests in CI

For the JOSS documentation check of openjournals/joss-reviews#4480

Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?

There are no automated tests (there are GitHub Action workflows for building and deploying the docs, but not for running the tests across all versions of Python supported). It seems that CI with some basic coverage of the API would greatly improve things.

For v0.8.3 there are no instructions on how to run the tests in the README (but they were added in for a future version in PR #34). It is additionally unclear what the test coverage of running the tests in the notebooks is, which would be important to know.

Create widget that supports drawing of polygons.

Motivation

ipyannotator currently supports single object annotation with bounding boxes (bbox). However, a bounding box often provides only a very coarse localization of an instance. For tasks such as instance segmentation a sequence of (x, y) points is used to define a polygon per object.

Polygon Annotation Widget

As a step towards supporting polygon annotations we need a widget that can display and accept user input for polygons. Since polygon annotation can be seen as a generalization of bbox annotation, a natural first step would be to use the bbox canvas notebook 01_bbox_canvas.ipynb as a blueprint for a polygon widget.

Suggested steps:

  • Write code draw_polygon(...) to render a predefined polygon on an image (see the sketch after this list).
  • Create a class PolygoneCanvas(HBox, traitlets.HasTraits) that can capture user input to define a polygon.
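
A hedged sketch of draw_polygon using ipycanvas' path API; how it would plug into 01_bbox_canvas.ipynb is left open:

from typing import List, Tuple

from ipycanvas import Canvas


def draw_polygon(canvas: Canvas, points: List[Tuple[float, float]], color: str = 'red'):
    # render a closed polygon defined by a sequence of (x, y) points
    canvas.stroke_style = color
    canvas.begin_path()
    canvas.move_to(*points[0])
    for x, y in points[1:]:
        canvas.line_to(x, y)
    canvas.close_path()
    canvas.stroke()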

[JOSS Review] Python version support and clarification of framework vs. application

For the JOSS functionality check of openjournals/joss-reviews#4480

Installation: Does installation proceed as outlined in the documentation?

Yes, but not in a satisfactory manner. The instructions are to run

$ pip install ipyannotator

however, the unversioned documentation (so one needs to use the README) and setup.py via settings.ini claim that ipyannotator supports Python 3.7+

min_python = 3.7

but that is in opposition with the active metadata on PyPI, which states that Python 3.8+ is required. If you use a Python 3.7 runtime to install ipyannotator you get an old version, v0.4.0, which is not a version that is part of the review.

Additionally, because of the way that Poetry locks down things, the dependencies that are listed in the documentation are very severely out of date/inaccurate. Instead of the 4 listed dependencies, ipyannotator actually incurs 15 dependencies. The additional constraints that are imposed with Poetry's syntax mean that the claim that Python 3.8+ is supported is technically true, but there are multiple instances in which wheels are not available for modern CPython. For example, for Python 3.10 the Poetry constraint on scikit-image of

scikit-image = "^0.18.3"

translates to scikit-image<0.19.0,>=0.18.3 and as can be seen from PyPI and

$ python -m pip index versions scikit-image
WARNING: pip index is currently an experimental command. It may be removed/changed in a future release without prior warning.
scikit-image (0.19.3)
Available versions: 0.19.3, 0.19.2, 0.19.1, 0.19.0, 0.18.3, 0.18.2, 0.18.1, 0.18.0, 0.17.2, 0.17.1, 0.16.2, 0.15.0, 0.14.5, 0.14.3, 0.14.2, 0.14.1, 0.14.0, 0.13.1, 0.13.0, 0.12.3, 0.12.2, 0.12.1, 0.12.0, 0.11.3, 0.11.2, 0.10.1, 0.10.0, 0.9.3, 0.9.1, 0.9.0, 0.8.2, 0.8.1, 0.8.0, 0.7.2
   INSTALLED: 0.18.3
   LATEST:    0.19.3

that leaves exactly one viable version, scikit-image v0.18.3 for which there is no Python 3.10 wheel available and so the wheel has to be built from the sdist.

So while it is possible to install ipyannotator on Python 3.10 it takes quite some time and results in a large collection of dependencies

$ python -m pip freeze | wc -l
105

that are extremely constrained. This is of course technically fine, but the lines between what is a "framework" and what is an "application" start to blur quite a bit: if ipyannotator is meant to be used alongside other software in an environment, it is far too restricting.

The documentation around the installation should try to clarify this or make it clear that the versions of Python that the authors intend the software to be used with are more constrained than what is currently shown.

Add support for class labeling of bbox

Motivation

ipyannotator currently supports single object localization with a bounding box (bbox). However, annotating the type / class of the object is currently not supported.

Develop and Integrate Class Labeled BBox Widget

Suggested steps:

  • explore 01c_tutorial_bbox.ipynb to see the current bbox implementation in action
  • extend 01_bbox_canvas.ipynb with a new class LabeledBBoxCanvas which supports displaying and drawing of a class-labeled bbox (see the sketch after this list)
  • duplicate 04_bbox_annotator.ipynb as 04b_class_bbox_annotator.ipynb and replace the BBoxCanvas with LabeledBBoxCanvas
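
A hypothetical skeleton of LabeledBBoxCanvas (the names follow this issue; the trait layout is an assumption):

import traitlets
from ipywidgets import HBox


class LabeledBBoxCanvas(HBox):
    # bbox coordinates as (x, y, width, height) plus the object's class label
    bbox = traitlets.Tuple(default_value=(0, 0, 0, 0))
    label = traitlets.Unicode(default_value='')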

Use point set to shape mapping to improve polygon annotation widget

Motivation

The polygon annotation widget nbs/01d_polygon_canvas.ipynb #5 (not yet included in the public GitHub release) supports only sequential polygon annotation. Adding additional boundary points after creating an initial polygon depends on the creation order of the boundary points, which is not very intuitive and also slow to execute.

Defining the polygon by a set of points instead of a list would make adding and removing of points much simpler.
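
A minimal baseline sketch of such a point-set-to-shape mapping (ordering the points by angle around their centroid; this is an illustrative assumption, not the alpha-concave hull discussed below):

import math
from typing import List, Set, Tuple

Point = Tuple[float, float]


def points_to_polygon(points: Set[Point]) -> List[Point]:
    # order boundary points by angle around their centroid so the resulting
    # polygon no longer depends on the creation order of the points
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))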

Alpha-Concave Hull

In computational geometry, an alpha shape, or α-shape, is a family of piecewise linear simple curves in the Euclidean plane associated with the shape of a finite set of points. They were first defined by Edelsbrunner, Kirkpatrick & Seidel (1983). The alpha-shape associated with a set of points is a generalization of the concept of the convex hull, i.e. every convex hull is an alpha-shape but not every alpha shape is a convex hull.
source: https://en.wikipedia.org/wiki/Alpha_shape

Alpha-Concave Hull [0] is one such algorithm that could be interesting for us.

Suggested steps:

  • literature review to find a simple algorithm to associate a shape with a set of points
  • proof-of-concept implementation of the algorithm in 01d_polygon_canvas.ipynb
  • create a minimal sequence diagram (https://plantuml.com/sequence-diagram) to specify the expected user interaction for creating and deleting points
  • refactor the current polygon annotation widget to support the new algorithm

[0] Alpha-Concave Hull, a Generalization of Convex Hull

Note: This is an internal issue until the polygon annotation widget #5 is released.

[JOSS Review] Suggested revisions on code quality

For suggested revisions for openjournals/joss-reviews#4480:

In the code there are multiple instances of things like

from nbdev import *
$ git grep 'import \*'
dev_notes.md:from nbdev.export import *
nbs/00d_doc_utils.ipynb:    "from nbdev import *"
nbs/01_bbox_canvas.ipynb:    "from nbdev import *"
nbs/01_helpers.ipynb:    "from nbdev import *\n",
nbs/01a_datasets.ipynb:    "from nbdev import *"
nbs/01b_dataset_video.ipynb:    "from nbdev import *"
nbs/01b_tutorial_image_classification.ipynb:    "from nbdev import *"
nbs/01c_tutorial_bbox.ipynb:    "from nbdev import *"
nbs/01d_tutorial_video_annotator.ipynb:    "from nbdev import *"
nbs/02_navi_widget.ipynb:    "from nbdev import *\n",
nbs/02a_right_menu_widget.ipynb:    "from nbdev import *"
nbs/02b_grid_menu.ipynb:    "from nbdev import *"
nbs/03_storage.ipynb:    "from nbdev import *\n",
nbs/04_bbox_annotator.ipynb:    "from nbdev import *"
nbs/05_image_button.ipynb:    "from nbdev import *"
nbs/06_capture_annotator.ipynb:    "from nbdev import *"
nbs/07_im2im_annotator.ipynb:    "from nbdev import *"
nbs/13_datasets_legacy.ipynb:    "from nbdev import *\n",
nbs/15_coordinates_input.ipynb:    "from nbdev import *"
nbs/16_custom_buttons.ipynb:    "from nbdev import *"
nbs/18_bbox_trajectory.ipynb:    "from nbdev import *\n",
nbs/19_bbox_video_annotator.ipynb:    "from nbdev import *"
scripts/check_lint.sh:# F403 -> 'from module import *' used; unable to detect undefined names
scripts/check_lint.sh:# Cause : "from nbdev import *"

The use of import * should be avoided at all costs in Python code unless there is an extremely good reason to use it. For anything close to a library I do not think there are any.

Improve image labeling for large number of classes

Motivation

ipyannotator currently supports image labeling. However, for datasets with a very large number of classes it's very difficult to quickly match the image to the right class.

Showing a visual representation of all possible classes and their textual description right next to the image could considerably improve the process. Currently only a textual or a visual representation can be displayed.

Explore the current difficulties

  • run the notebook nbs/01b_tutorial_image_classification.ipynb with the data set dataset = 'oxford_flowers'


possible improvements:

  • make it easy to show the class name instead of the number (requires mapping from class id to class name)
  • show both visual and textual description
  • if the data set is already annotated, provide an option to take the visual example right from the data set
