biocrowd's Introduction

Notice: This repository is no longer being maintained. Please see CrowdTruth-core for the latest release of the framework.

The CrowdTruth Framework implements an approach to machine-human computing for collecting annotation data on text, images, sounds and videos. The approach is focussed specifically on collecting gold standard data for training and evaluation of cognitive computing systems. The original framework was inspired by the IBM Watson project for providing improved (multi-perspective) gold standard (medical) text annotation data for the training and evaluation of various IBM Watson components, such as Medical Relation Extraction, Medical Factor Extraction and Question-Answer passage alignment.

The CrowdTruth framework supports the composition of CrowdTruth gathering workflows, where a sequence of micro-annotation tasks can be configured and sent out to a number of crowdsourcing platforms (e.g. CrowdFlower and Amazon Mechanical Turk) and applications (e.g. Expert annotation game Dr. Detective). The CrowdTruth framework has a special focus on micro-tasks for knowledge extraction in medical text (e.g. medical documents, from various sources such as Wikipedia articles or patient case reports). The main steps involved in the CrowdTruth workflow are:

  1. Exploring & processing of input data
  2. Collecting of annotation data
  3. Applying disagreement analytics on the results

These steps are realised in an automatic end-to-end workflow that supports continuous collection of high-quality gold standard data, with a feedback loop to every step of the process. Have a look at our presentations and papers for more details on the research.

Using CrowdTruth

Start using CrowdTruth right now, completely free, and explore all its possibilities. Follow the installation guide to get started, or check out our wiki for the full documentation of the platform. We have some crowdsourcing templates ready for you to start with.

biocrowd's People

Contributors

bouncer, c-martinez, merelvanempel

biocrowd's Issues

Extend the CT API so batches can be edited

It would be great if batches could be edited after they are uploaded and running.
For instance, adding more images to the batch or removing images that are sufficiently annotated in terms of this particular task.

Division by zero error

When creating a new user:

Division by zero (View: /var/www/BioCrowd/app/views/sidebar.blade.php) (View: /var/www/BioCrowd/app/views/sidebar.blade.php)

Line number 35

Improve image annotations

Requirements:

  • Clicking and getting a dot instead of a square
  • Drag to create a bounding box
  • Avoid 'Save' dialog after drawing a bounding box

If annotorious won't do it, consider using these libraries:

Or even just doing our own based on HTML5 canvas:

Pausing tasks

Provide an option to pause uncompleted tasks and resume them later.

This is useful to send the users e-mails to remind them to finish and to see how many people are not finishing tasks. Being able to interrupt a task or recover from browser crash are other possible advantages.

This feature might be a lot of effort for very little benefit and needs to be discussed.
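
To help that discussion, here is a minimal sketch of what pausing and resuming could involve. It is written in Python for illustration only and is not taken from the BioCrowd code base; the `TaskSession` class, its fields and the status names are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TaskSession:
    """Hypothetical record of one user's progress on one task."""
    user_id: int
    task_id: int
    status: str = "in_progress"  # one of: in_progress, paused, completed
    partial_answers: dict = field(default_factory=dict)

    def pause(self):
        # Only a running task can be paused.
        if self.status != "in_progress":
            raise ValueError("only an in-progress task can be paused")
        self.status = "paused"

    def resume(self):
        # Only a paused task can be resumed.
        if self.status != "paused":
            raise ValueError("only a paused task can be resumed")
        self.status = "in_progress"

    def complete(self):
        # A task is completed from the running state, never while paused.
        if self.status != "in_progress":
            raise ValueError("only an in-progress task can be completed")
        self.status = "completed"


def save(session):
    """Serialise a session so it survives a browser crash or a pause."""
    return json.dumps(asdict(session))


def load(payload):
    """Rebuild a session from its serialised form."""
    return TaskSession(**json.loads(payload))


def unfinished(sessions):
    """Sessions whose users could be sent a reminder e-mail."""
    return [s for s in sessions if s.status != "completed"]
```

Saving the partial answers on every change, rather than only on pause, is what would make recovery from a browser crash possible; whether that extra storage is worth the effort is part of the open question above.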

Campaigns

The campaigns feature needs to be implemented.

Loosely defined, a campaign is a chain of tasks to be completed one after the next. A more structured definition of campaigns is required.

JQuery.js

The jquery.js file in the public directory is blank and should be removed.

However, removing it causes a bug in Firefox. Fix the bug first, then remove the file.

Game scoring

A scoring system needs to be defined and implemented. The following considerations need to be taken into account:

  • When and how many points are given to each user for each task?
  • Are level, task type (or perhaps even specific task!) factors which affect the number of points given?
  • When, why and how many points are subtracted from a user?
  • When and in which format are scores displayed?
    • As a total score on the user mini-profile (next to name and level)
    • On the leader board?
    • On the 'last 5 days' trend graph (number of points per day)?
    • On a summary page (+X points for task A, +Y points for task B, -Z points for task C).
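
One possible way to keep all of these display formats open is to store every award or deduction as a separate event and derive each view from that list. The sketch below is in Python for illustration only; `ScoreEvent` and the helper functions are hypothetical names, and the point values are left entirely to whatever scoring rules are eventually agreed.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ScoreEvent:
    """Hypothetical ledger entry: points gained (or lost, if negative)."""
    user_id: int
    task: str
    points: int
    day: date


def total_score(events, user_id):
    """Total shown on the user mini-profile."""
    return sum(e.points for e in events if e.user_id == user_id)


def leaderboard(events):
    """(user_id, total) pairs, highest total first."""
    totals = defaultdict(int)
    for e in events:
        totals[e.user_id] += e.points
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


def daily_trend(events, user_id, today, days=5):
    """Points per day for the last `days` days, oldest day first."""
    per_day = defaultdict(int)
    for e in events:
        if e.user_id == user_id:
            per_day[e.day] += e.points
    window = [today - timedelta(days=offset) for offset in range(days - 1, -1, -1)]
    return [(d, per_day.get(d, 0)) for d in window]


def summary(events, user_id):
    """Lines for a summary page, e.g. '+10 points for task A'."""
    return [f"{e.points:+d} points for {e.task}" for e in events if e.user_id == user_id]
```

With an event list like this, changing how scores are displayed does not require changing how they are stored.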

Document!

At the moment, the code base is still quite small, but we need to keep the documentation as up to date as possible. It will be easier to have good documentation from the beginning and add to it as the code base grows than to wait until the code base is larger and the documentation effort is huge.

Clean DatabaseSeeder

When we are ready to make a first 'stable' release, we should:

  • Remove DevelopDBSeeder for final release
  • Set $adminUser and $adminPassword to some 'reasonable' initial value (and document it somewhere!)

Storing the campaign progress in a better way

Right now, the next game in a campaign is selected based on the number of judgements the user has made for each game in that campaign.
The game in the campaign with the fewest judgements is chosen as the next game.
This system is good in theory, but it requires every row in the judgements table to carry a campaign_id and a game_id, and if a user finishes a game that belongs to two campaigns, we want both campaigns to be updated.
This means that two judgements need to be stored: one for campaign A and one for campaign B.
Two judgements are stored, but only one was really made: the user makes a single judgement for one image by clicking the submit button once.
This is ugly and should be done in a better way.
One option is a separate table with user_id, judgement_id, game_id and campaign_id, in which the same judgement_id appears twice if the judgement counts for two different campaigns.
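
The separate-table idea can be sketched as follows. This is an illustration in Python with an in-memory SQLite database, not the project's actual schema: the table names `campaign_games` and `campaign_progress` and the function names are assumptions; only the columns user_id, judgement_id, game_id and campaign_id come from the proposal above.

```python
import sqlite3


def create_schema(conn):
    """Create a simplified, hypothetical version of the tables involved."""
    conn.executescript("""
        CREATE TABLE judgements (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL
        );
        CREATE TABLE campaign_games (
            campaign_id INTEGER NOT NULL,
            game_id INTEGER NOT NULL,
            PRIMARY KEY (campaign_id, game_id)
        );
        CREATE TABLE campaign_progress (
            user_id INTEGER NOT NULL,
            judgement_id INTEGER NOT NULL REFERENCES judgements(id),
            game_id INTEGER NOT NULL,
            campaign_id INTEGER NOT NULL
        );
    """)


def record_judgement(conn, user_id, game_id):
    """Store ONE judgement and link it to every campaign containing the game."""
    cur = conn.execute("INSERT INTO judgements (user_id) VALUES (?)", (user_id,))
    judgement_id = cur.lastrowid
    conn.execute(
        """INSERT INTO campaign_progress (user_id, judgement_id, game_id, campaign_id)
           SELECT ?, ?, game_id, campaign_id
           FROM campaign_games WHERE game_id = ?""",
        (user_id, judgement_id, game_id),
    )
    return judgement_id


def next_game(conn, user_id, campaign_id):
    """Return the game in the campaign with the fewest judgements by this user."""
    row = conn.execute(
        """SELECT cg.game_id, COUNT(cp.judgement_id) AS n
           FROM campaign_games cg
           LEFT JOIN campaign_progress cp
             ON cp.campaign_id = cg.campaign_id
            AND cp.game_id = cg.game_id
            AND cp.user_id = ?
           WHERE cg.campaign_id = ?
           GROUP BY cg.game_id
           ORDER BY n ASC, cg.game_id ASC
           LIMIT 1""",
        (user_id, campaign_id),
    ).fetchone()
    return row[0] if row else None
```

In this sketch the judgements table holds exactly one row per click on the submit button, while the link table carries one row per campaign that the judgement counts towards.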

Level unlocking

Remove the locks once the user is allowed to play games of that level.

To be defined:

  • Whether a player can or cannot play games of a higher level
  • When players 'Level up' and what it implies

More intuitive / generic names & descriptions.

VesExGameType and CellExGameType could (should? must?) have more intuitive / generic names and descriptions.

Currently these GameTypes are (according to their descriptions) used for:

  • Extracting cells from microscopic images
  • Extracting vesicles from microscopic images

However, they could have a more generic description, for example "Adding box annotations to images".
