Crowdom

Crowdom is a tool for simplifying data labeling.

Write plain Python code and launch data labeling without any knowledge of crowdsourcing or the underlying platform (Crowdom uses Toloka as the platform for publishing tasks for workers). Define the task you are solving and load source data with a few lines of code, choose a quality-cost-speed tradeoff in an interactive UI form, launch data labeling, and study the labeling results in Pandas dataframes.

Quickstart

We recommend starting with the image classification example, since it demonstrates the full data labeling workflow proposed in Crowdom, with detailed explanations for each step.

The other examples show what data labeling with Crowdom looks like for different types of tasks.

Join our Telegram chat if you want to learn more about Crowdom or discuss your task with us.

Types of tasks

Tasks in Crowdom are divided into two types:

  • Classification tasks, which have a fixed set of labels as output.
  • Annotation tasks, whose output has "unlimited" dimension.

In a typical classification task, a worker is asked to choose one of the predetermined options. Side-by-side (SbS) comparison is a special case of a classification task.

An annotation task, by contrast, may have many potential solutions, and more than one of them may be correct. Speech transcription and image annotation are examples of annotation tasks.
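The difference between the two task types can be sketched in plain Python. This is only an illustration of the idea, not Crowdom's API: the names `Animal`, `ClassificationAnswer`, `AnnotationAnswer` and `parse_label` are hypothetical and were made up for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Animal(Enum):
    """Hypothetical fixed label set for a classification task."""
    CAT = "cat"
    DOG = "dog"
    OTHER = "other"


@dataclass(frozen=True)
class ClassificationAnswer:
    """A worker's answer is always one member of the predefined label set."""
    image_url: str
    label: Animal


@dataclass(frozen=True)
class AnnotationAnswer:
    """A worker's answer is free-form text, so the output space is unbounded."""
    audio_url: str
    transcript: str


def parse_label(raw: str) -> Animal:
    """Convert a raw string to a label; raises ValueError outside the fixed set."""
    return Animal(raw)


# Classification: the answer can be validated against the label set.
answer = ClassificationAnswer("https://example.com/1.jpg", parse_label("cat"))

# Annotation: several different transcripts may all be acceptable.
variants = [
    AnnotationAnswer("https://example.com/1.wav", "hello world"),
    AnnotationAnswer("https://example.com/1.wav", "Hello, world!"),
]
```

Because classification answers come from a closed set, they can be checked and compared directly, while annotation answers usually need a separate judgment of whether each one is acceptable.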

Examples

The following table lists examples that demonstrate data labeling for different types of tasks, as well as other aspects of the data labeling workflow.

Examples are presented as .ipynb files located in this repository, but displayed by nbviewer, which renders them more precisely than GitHub.

The image classification and audio transcript examples also have .html versions. These examples present the full labeling workflow for the classification and annotation types of tasks, respectively. The .html version lets you collapse optional notebook sections to make the main steps of the workflow easier to follow, and it displays the contents of interactive widgets (for example, the quality-cost-speed tradeoff interactive form).

| Example | Full workflow | Function | Data types | Additionally |
|---|---|---|---|---|
| Image classification (HTML) | ✅ | Classification | Image | |
| Audio transcript (HTML) | ✅ | Annotation | Audio, Text | |
| Audio transcripts SbS | | SbS | Audio, Text | |
| Voice recording | | Annotation | Text, Audio | Media output, checking annotations by the ML model |
| Audio transcript, extended | | Annotation | Text, Audio | Custom task UI, custom task duration calculation, first annotation attempts by the ML model |
| MOS | | Classification | Audio | MOS algorithm usage example |
| Audio questions | | Classification | Audio | Output label set depending on the input data |
| Experts registration | | | | Registration of your private expert workforce |
| Task update | | | | Task update (instructions, UI, etc.) |
| Async launch | | | | Non-blocking or parallel labeling launches |

Communication

Join our communities if you have questions about Crowdom or want to discuss your data labeling task.
