
freesound-datasets's Issues

Design of subsets for FSD 1st release

The first release of FreesoundDataset (FSD) could have several subsets. For every subset, there will be a balanced version (all sound classes have the same number of annotations) and an unbalanced version. Appropriate splits will be made for every version (tr/val/te or tr/te), including stratification for the unbalanced version. User or pack filtering could also be considered, so that one user or pack is part of only one set. For now, two subsets are under consideration: Full and Medium.

  1. Full. The motivation is to keep as many categories as possible (ideally the 632 of the AudioSet ontology), so that the ontology is (almost) fully covered, avoiding data-driven “holes” that could be perpetuated over time. This implies having a small number of agreed annotations in some categories. This is probably not enough for certain machine learning techniques, e.g., deep learning, but it can be enough for other approaches. It is also an intermediate step towards the Medium subset.
  • Features:
    • Balanced version includes a small number of annotations, e.g., 10 in each category
    • Unbalanced version includes all available categories with all the annotations. Splits will require stratified sampling.
  2. Medium. The idea is to have a smaller number of categories that allows us to have a larger number of annotations in each of them. For instance, we have currently selected 398 categories, each of which has at least 72 validated annotations, of which at least 30 or 40 are Present. The number of categories and annotations per category is to be decided.
  • Features:
    • Balanced version includes a medium number of annotations, minimum of 50 or 60, in each category.
    • Unbalanced version includes all the annotations for the selected categories. Splits will require stratified sampling.

Remove duplicated votes

After the Tagathon organized in mid-April, some duplicates were found in the DB:
around 1000 votes out of 45000 are duplicates. Closer inspection of the data showed that some vote pages were submitted twice (perhaps the effect of a double click on the "Submit" button?).

The timestamps are slightly different, so I guess we should identify duplicates by looking at these fields of each Vote row:

  • vote.created_by
  • vote.vote
  • vote.annotation_id
  • vote.created_at (values are close but not identical)

It is quite easy to do with the Django ORM; however, a backup should be made before launching the cleaning process, and a verification step should be performed after the cleaning.

Cleaning process:

from datetime import timedelta
from datasets.models import Vote  # assumed model location

for row in Vote.objects.all():
    # Duplicates: same user, same vote, same annotation, created within ~10 seconds.
    if Vote.objects.filter(created_by=row.created_by,
                           vote=row.vote,
                           annotation_id=row.annotation_id,
                           created_at__lte=row.created_at + timedelta(seconds=10),
                           created_at__gt=row.created_at - timedelta(seconds=10)).count() > 1:
        # Deleting this row leaves one copy, since the remaining one no longer matches more than once.
        row.delete()

I am not sure what the best way to do the backup and the verification is; a sketch of one option follows.
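A minimal sketch of one possible backup-and-verification flow, assuming the app is called datasets (the file name and the check below are illustrative, not a decided approach):

# Backup before cleaning, using Django's standard dumpdata command:
#   python manage.py dumpdata datasets.Vote --indent 2 > votes_backup.json

# Verification after cleaning: re-run the duplicate detection and check that nothing is left.
n_remaining_duplicates = sum(
    1 for row in Vote.objects.all()
    if Vote.objects.filter(created_by=row.created_by,
                           vote=row.vote,
                           annotation_id=row.annotation_id,
                           created_at__lte=row.created_at + timedelta(seconds=10),
                           created_at__gt=row.created_at - timedelta(seconds=10)).count() > 1
)
print("duplicates still in DB:", n_remaining_duplicates)  # expected: 0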

Easy to log in with different accounts

The platform allows logging in with different services, and it is easy to pick a different one each time.
The problem is that a single person could then get an annotation verified on their own, without actual agreement between different users. Moreover, that person's contributions are not counted together.

  • One easy solution would be to add a long-term cookie that remembers the service the user previously used to log in, to ensure that the user logs in to our platform with the same service (not much work).
  • Another would be to add functionality to link users from different services (more work around the social authentication functionality).

Clean some migrations

There are two migrations that should be removed because they just cancel each other:

  • 0025_auto_20170706_1903

  • 0026_auto_20170706_1942

Play button is very small

When listening to sounds you have to click directly on the play button, which is very small.
I would like to be able to click anywhere on the sound, but currently this starts playback from the position where I click.

Add User model

In order to store some properties about users (e.g., is_trustable for quality control), we need to add a custom User model.

Using AbstractUser seems to be the best solution in our case.
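A minimal sketch of what this could look like, assuming a datasets app and that the is_trustable flag mentioned above is a simple boolean (names are illustrative, not final):

# datasets/models.py
from django.contrib.auth.models import AbstractUser
from django.db import models

class User(AbstractUser):
    # Quality-control flag; new users start as non-trustable.
    is_trustable = models.BooleanField(default=False)

# settings.py
AUTH_USER_MODEL = 'datasets.User'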

Split AudioSet ontology into groups for crowdsourcing validations

Feedback gathered in the conducted experiments shows that validating annotations of isolated categories may lead to inconsistent ratings if the most similar categories are not considered. To address this issue, groups of related categories could be (i) created and (ii) suggested to the rater. The goal is to keep raters from picking arbitrary categories by instead proposing small groups composed of siblings and small subfamilies. The groups should be small, taking into account the possible lack of commitment in the crowdsourcing approach.

Assignment of these groups to raters may happen at the beginning of the procedure, just after showing the guidelines #28 and before passing to the training phase #27 (which would be specific to the group of categories under consideration).

Assignment of these groups to raters may be done by:

  • Suggesting groups to the rater according to some criteria, e.g., categories with fewer validations.
  • Letting the rater choose from a list according to his preference.

Hence this issue can be split into two tasks:

  1. split the ontology into meaningful small groups of categories
  2. design the criteria and a system that allows assigning groups to raters

Sometimes only 11 forms appear when validating

I've seen that sometimes the validation form presents only 11 sounds to vote on, even when there are still enough annotations to present 12 sounds.

I guess something goes wrong when adding test examples in the view function contribute_validate_annotations_category.

Filter out some categories from the category choose table

In the table for choosing a category to validate annotations from, some categories appearing at the second level should not be shown.

For instance, Silence under Source-ambiguous sounds has no children and is omitted.
It should not be proposed in the table.

Another example: Whistling and its children are omitted and should not appear in the table either.

We need to add a filter using the omitted field of TaxonomyNode.
Perhaps the best way is to add an option to the get_nodes_at_level method for getting only the non-omitted nodes, as in the sketch below.
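A minimal sketch of such an option, assuming get_nodes_at_level currently builds a queryset of TaxonomyNode objects for a given hierarchy level (the query-building step is a hypothetical placeholder; only the filtering is the point here):

def get_nodes_at_level(self, level, include_omitted=True):
    nodes = self.get_all_nodes_at_level(level)  # hypothetical name for the existing query-building step
    if not include_omitted:
        nodes = nodes.exclude(omitted=True)     # drop nodes flagged as omitted
    return nodes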

Add pagination for navigation in explore view

Currently, the explore view for each category outputs only 10 examples (always the same ones).
We need to be able to explore the data by adding pagination.

Two explorations could be useful:

  • Exploration of the data automatically-labeled with a category.
  • Exploration of the validated data.

A single paginated page (for each category) could be enough if we add a column for votes and a "sort by" functionality. A sketch of the pagination step follows.
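A minimal sketch using Django's built-in Paginator, assuming a hypothetical explore view and that the annotations of a category are available as a queryset (model, relation and template names are illustrative):

from django.core.paginator import Paginator
from django.shortcuts import render

from datasets.models import Annotation  # assumed model location

def explore_category(request, node_id):
    annotations = Annotation.objects.filter(taxonomy_node__node_id=node_id)  # assumed relation
    paginator = Paginator(annotations, 10)                 # keep 10 examples per page
    page = paginator.page(request.GET.get('page', 1))      # ?page=N in the URL
    return render(request, 'datasets/explore.html', {'page': page})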

Add strategy for prioritizing annotations to be voted

In order to have "the best dataset we can at a time t", we have chosen some constraints [TO BE DISCUSSED]:

  • vote on all annotation candidates for a sound (in order to get closer to "complete" annotation of a sound)
  • annotation candidates need 2 identical votes to be considered valid
  • prioritize sounds shorter than 30 seconds
  • prioritize sounds with "good quality" (use Freesound downloads and ratings? a descriptor for quality?)

We need to implement a "manager" that selects the annotations and the sounds to be voted on.
Ideally, a priority ranking should be derived from the constraints, and the annotations should be proposed to crowd workers following this ranking.

Show examples per taxonomy category

Store a list of Freesound IDs as selected examples for each category. This must be stored in the taxonomy data field (json data). Show the selected examples as Freesound embeds on the category page.

Selected examples per category can come from current manual annotation process.

Annotation protocol for validation task

Currently, one task is available in the FSD: https://datasets.freesound.org/fsd/contribute/validate_annotations/?help=1

Considering the less controlled scenario that crowdsourcing is, the annotation protocol should be improved before the platform launch. This is a huge task that could be split into smaller ones, but for now it is defined as one in order to show the big picture and concentrate the discussion.

We assume that the user is familiar with the platform, and has chosen to contribute in this dataset (FSD), for which he/she has clicked on the corresponding button (currently: Validate Category Annotations). The way we see the protocol is:

  1. User is presented with the task instructions or guidelines, either in text or video format (#28), including, at least:
  • Task definition
  • Protocol description, using the platform
  2. User chooses a sound category (either from our priority list or according to his interests) (#29).

  3. Training phase. This would show the context of the selected category AND that of its siblings/parents in order to help the rater form their judgement before proceeding with the task. This phase may include:

  • Explicit part:
    • Showing categories' hierarchy, direct children, descriptions, at once, for inspection
    • Showing also good sound examples for every category (#30)
    • This is to be done for the selected category, its siblings and parent(s)
  • Hidden part (of which the rater is not aware), presented within the Validation phase, consisting of one page of sounds with:
    • Good examples (clearly belonging to the category, to train the subject) (#30), and bad examples (probably randomly selected)
    • Use the examples as Quality Control (#31 )
  4. Validation phase. This is where the actual task takes place.
  • Quality control should be implemented (#31 ) to make sure gathered annotations are consistent
  • We should have a priority system for the presentation of the annotations to be validated within every category (#23 )
  • Fix the number of annotations per category (i.e., a number of pages) per rater. This could be a fixed number or a fixed range. When it is done, propose these options:
    • continue with another category from the list where the initial one was chosen
    • go to beginning

Add criteria for dataset release

Currently, there is no strategy for the creation of a dataset release.

In the file datasets/tasks.py, generate_release_index() should:

  • create a DatasetRelease and add the Annotations that are considered correct according to some criteria.
  • create a file with the FS ids, ontology labels, etc. (already implemented, but the structure of the file may change).

Sounds from Freesound are sometimes deleted

It happens that sounds selected by the mapping that creates the annotation candidates are later deleted from Freesound.
In these cases, the embeds presented in the validation forms do not work.

We should create a command for "synchronizing" our platform with Freesound.
It should check that the sounds are still in Freesound and, if they are not, flag them to indicate that for now we do not handle them.
We could still use them, because the sounds are not actually deleted from the Freesound database, but they are no longer publicly available.

This command could also update some metadata (e.g., number of downloads and ratings), which could be used for prioritizing some annotations #23. A skeleton of such a command is sketched below.
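A minimal skeleton of the synchronization command, assuming a Sound model with a deleted_from_freesound flag and a hypothetical helper that queries the Freesound API (both the flag and the helper are assumptions, not existing code):

# datasets/management/commands/sync_with_freesound.py
from django.core.management.base import BaseCommand
from datasets.models import Sound  # assumed model

class Command(BaseCommand):
    help = "Flag sounds that are no longer available in Freesound"

    def handle(self, *args, **options):
        for sound in Sound.objects.all():
            # sound_still_in_freesound() is a hypothetical helper wrapping the Freesound API.
            if not sound_still_in_freesound(sound.freesound_id):
                sound.deleted_from_freesound = True  # assumed flag field
                sound.save(update_fields=['deleted_from_freesound'])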

Add verification examples as optional

As said in #30, the task of providing positive examples for all the categories is hard and won't be finished for the platform's launch.

I plan to make this quality-control mechanism optional by adding a boolean field positive_verification_examples_activated to TaxonomyNode,
so we can activate it for the categories for which we have examples, and deactivate it for the ones that do not yet have enough examples.

For false examples #57, I think it is possible to obtain them and have the mechanism functional when the platform launches; it can serve as a spam-filtering process.
However, I also consider making this process optional and adding a boolean field negative_verification_examples_activated to TaxonomyNode.

Designing Explicit part of Training phase

After choosing a sound category, the user is redirected to a new page, which is the explicit part of the training phase.

Aim: show the context of the selected category AND that of its siblings/parents in order to help the rater form their judgement before proceeding with the task.

In this page we list:

  • Selected category
  • Parent(s) of Selected category
  • Siblings of Selected category

For every category we display:

  • place in the hierarchy
  • description
  • direct children
  • good sound examples (#30)
  • anything else?

We should specify this further:

  • web design
  • how many examples
  • ?

Add link in category hierarchy path

It would be nice to be able to navigate through categories from the hierarchy path (for both exploring and annotating), like on the AudioSet website.

Optimize compute_dataset_taxonomy_stats queries

In compute_dataset_taxonomy_stats (https://github.com/MTG/freesound-datasets/blob/master/datasets/tasks.py#L89) we carry out one single big query to get the number of annotations and the number of sounds for all taxonomy categories, and then we run one extra small query per category to get the number of non-validated annotations. There are probably two ways to optimize this:

  • Get all the information for all categories in a single query. We tried to do that, but the resulting query took a really long time (~hours) to compute for the full-sized dataset (approx. 250k sounds, 500k annotations). We reverted to separate queries as a quick fix to keep this function usable, but maybe the query can be improved to run quickly. One way it could surely be improved is by adding an is_validated field to the datasets.models.Annotation model, updated whenever new votes for an annotation are created. However, the first option would be to try to make it fast without storing that intermediate value.

  • Get all the information regarding the number of sounds and annotations in one single big query (like now), and get all the information regarding the number of non-validated annotations in another single big query (so running 2 big queries instead of 1 + 1 × num categories), as sketched below.
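A minimal sketch of the extra query for the second option, grouping annotations without any vote by category in a single aggregation; the votes reverse relation is an assumption about the actual models, while value is the category label field mentioned elsewhere in these issues:

from django.db.models import Count

from datasets.models import Annotation  # assumed model location

non_validated_per_category = (
    Annotation.objects
    .filter(votes__isnull=True)          # annotations that have not received any vote yet
    .values('value')                     # group by category label
    .annotate(num=Count('id'))
)
# -> e.g. [{'value': '/m/09x0r', 'num': 42}, ...], one row per category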

Add access to admin page for editing TaxonomyNode fields

The administration page can be used for editing the content of the database, such as information related to TaxonomyNode.
Here is what we plan:

  • Add a link on each category page, visible to admin users, to the administration page of the corresponding TaxonomyNode
  • Configure the admin page to allow editing the TaxonomyNode FAQ and positive/negative examples (see the sketch below)
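A minimal sketch of the admin configuration, assuming a datasets/admin.py module; the listed field names follow the plans discussed in other issues and may differ from the actual model:

# datasets/admin.py
from django.contrib import admin
from datasets.models import TaxonomyNode

@admin.register(TaxonomyNode)
class TaxonomyNodeAdmin(admin.ModelAdmin):
    # Expose only the fields we want admins to edit by hand (names are assumptions).
    fields = ('name', 'faq', 'positive_verification_examples', 'negative_verification_examples')
    search_fields = ('name',)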

Subset of annotations for FSD 1st release

After defining the mapping to create lists of candidate samples to fill the sound categories (#25), we should decide on a subset of all mapped annotations on which to concentrate the validations in the initial validation task for the FSD 1st release. This is related to the design of the subsets #24, in the sense that the chosen subset of annotations should meet the requirements of the data subsets that we want to provide.

A simple example:
Suppose we want the Medium subset of the FSD 1st release to have ~100k annotations with rater agreement. When validating an annotation, raters can answer NP or U with a certain probability. This means that we should select >100k annotations as a starting point.

Discussion/communication channels

Apart from hosting discussions through issues in this repository, we could consider other channels to promote communication:

  1. We could have a mailing list, where people can ask things. This could also be used for announcements.
  2. It was suggested to enable a contact form in the platform, and possibly have a blog where platform users can post.

Any suggestions?

Add table for selecting category to validate

In the protocol of the validation task, there is a table that shows up after the guidelines. It is split into 2 parts:

  • Left side: "What category would you like to validate?"

    • Ask the user to advance through the first 2 levels of the ASO (3 if the majority of siblings are also parents) based on their interest
    • Present list with resulting subset of categories
  • Right side: "Our priority list:"

    • Present list of categories that require validations the most

In both cases:

  • the presented list is composed of categories sorted according to a predefined numeric score (#36)
  • the first category is pre-selected, but it is possible to choose a different one

By clicking Continue, the user navigates to a different page that shows the explicit part of the training phase.

Improve guidelines for validation task

Currently, the validation task available has a set of instructions. When the improved annotation protocol is ready #27 , these guidelines should be updated, considering the crowdsourcing scenario. Providing a short video instead of a text with instructions can be useful to transmit the message in a more appealing and dynamic way (although the corresponding text version could still be available).

The guidelines should focus on the procedure or protocol that the rater will follow, familiarizing him with the web tool, including at least:

  • task description: question and response types
  • training phase explanation
  • validation phase explanation

Add time graph for annotator contributions

It would be nice to have a graph on the contribute page, next to the "Annotators' ranking", to visualize annotators' contributions over time.
Not that useful, but it could maybe motivate people in their progression...

Annotation tasks brainstorming for FSD

Currently, we have only one task available for FSD: a validation task, which consists of validating annotations that were automatically generated through a mapping. More specifically, for a given sound category, the rater is prompted with the question: Is <sound category> present in the following sounds?

While at the moment this is the only task, hopefully we will have more in the future. This issue is to brainstorm about other possible tasks of interest. For example:

  1. Define timestamps (start and end times, or onsets and offsets) for the instances of acoustic events within an audio sample. The validation task that we already have allows us to evaluate the presence of a sound category in an audio sample, but in many cases the samples are relatively long (up to 90 s) and thus we have no knowledge of when exactly the type of sound occurs. These cases can be referred to as weakly labeled data. Defining exact timestamps would turn these cases into strongly labeled data while enabling evaluation of other tasks, e.g., detection of acoustic events in a continuous audio stream.

Test examples in validate category annotation

#44 implemented a User profile model for holding some information about the user, as well as a method for verifying that the user is able to perform the validation task correctly.

When a user enters the platform for the first time, they are considered "non-trustable".
When they go to validate category annotations, some test examples are added to the first page.
If the user succeeds with those examples, they are then considered "trustable" and their votes will be considered trustable. If they do not succeed, their votes will be considered non-trustable until they succeed in validating the test examples.

There is one problem in the current implementation:
the "trustable" field is independent of the category. This means that if a user validates the test examples of "dog bark", they are also trustable for validating "Soprano Saxophone". This should not be the case.

After discussing with edufonseca, we decided that the following solution should be implemented:
when a user starts to validate a certain category, test examples are added to the first page.
Once in a while (for now, once every five pages), test examples are added again.
If the user changes category, test examples are added again (even if they have not completed 5 pages).

From the implementation point of view, I am thinking of adding a last_category_annotated field to Profile that lets us know which category the user annotated last (and thus the category for which the user is or is not trustable); a sketch follows.
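A minimal sketch of the proposed Profile change, assuming the Profile model from #44 and that categories are TaxonomyNode instances (field names are illustrative):

from django.conf import settings
from django.db import models

class Profile(models.Model):
    user = models.OneToOneField(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    is_trustable = models.BooleanField(default=False)
    # Category the user last annotated; the trust flag only applies to this category.
    last_category_annotated = models.ForeignKey('TaxonomyNode', null=True, blank=True,
                                                on_delete=models.SET_NULL)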

Prioritize annotations corresponding to short clips

For now, when a user contributes to the validate category annotations task, the annotations presented to them are the ones that are not considered ground truth and that they have never voted on, in the following order (verification annotations are intentionally left out so as not to clutter the explanation):

  1. the ones that have already been voted on at least once

If the number of annotations meeting the last criterion is less than 12 (the number of annotations to vote on in one form),

  2. annotations that have never been voted on are added.

It was decided to prioritize the annotations that correspond to clips with a short duration.
For that purpose, I propose the following:

  1. Add the annotations that have been voted on at least once
  2. If we don't reach 12, add the annotations that have never been voted on and that correspond to clips with a duration < 10 sec
  3. If we still don't reach 12, add the remaining annotations

At first, we will concentrate on validating the annotations that have already been voted on at least once, since we want to reach agreement quickly. Once there are no annotations with at least one vote, we concentrate on the short clips.

For that, we need to add a step in the contribute_validate_annotations_category view function, after selecting the annotations that have already been voted on, when the amount obtained does not reach 12. A sketch of the selection order follows.
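A minimal sketch of the proposed selection order; candidates is assumed to be the queryset of eligible annotations for the chosen category (not ground truth, not yet voted on by this user), and the votes and sound duration relations are assumptions about the actual models:

N_FORMS = 12  # number of annotations presented in one form

# 1) annotations that already received at least one vote
selected = list(candidates.filter(votes__isnull=False).distinct()[:N_FORMS])

# 2) top up with never-voted annotations whose clip is shorter than 10 seconds
if len(selected) < N_FORMS:
    short = candidates.filter(votes__isnull=True, sound__duration__lt=10)
    selected += list(short[:N_FORMS - len(selected)])

# 3) top up with any remaining never-voted annotations
if len(selected) < N_FORMS:
    rest = candidates.filter(votes__isnull=True).exclude(pk__in=[a.pk for a in selected])
    selected += list(rest[:N_FORMS - len(selected)])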

Examples for training phase of validation task

As mentioned in #27, ideally the annotation protocol will consist of a training phase followed by a validation phase. In the former, some representative audio examples should be presented to the rater. How do we choose these examples for every category? Several options:

  1. Clips whose validations were rated as PP and that have the highest Freesound ranking
  2. Clips randomly chosen among those validated as PP. These will vary from rater to rater, thus mitigating bias. However, some of these examples might not be very representative.

Explore FSD release & report wrong annotations

Apart from exploring the content of a current state of a dataset and/or of its available releases, we should also allow users to report faulty audio samples or wrong annotations.

As a first approximation, we thought of enabling some simple mechanism like a thumbs down button that users can click when they notice a fault.

A number of issues are yet to be decided. Some of them are:

  1. Where to place this functionality within the platform? Together (or not) with the Explore functionality?
  2. At which level do we enable this? So far, it has been suggested that it could be implemented at the release level.
  3. How to use this info? e.g., systematically flagged examples can be reallocated in a postprocessing stage and marked for further validation.

Improvements in annotation interface

After discussion with team members, the following changes are requested for the annotation interface:

  • Validations should have 4 options instead of 3, including 2 levels of "presentness".
  • Annotators should be able to send feedback comments for a particular category. This should be displayed on the annotate category page as an input box and a "Send feedback" button.
  • The interface should display a link to the sound in Freesound, but if the link is opened, we should record that this action happened in relation to the vote.
  • Fix the bug in the player; the right part of it seems to be unreachable (probably because it overlaps with the form)
  • Add a help icon which links to an instructions summary shown as a modal

False examples for quality control

In order to test whether a user can perform the annotation task and to filter spam, we sometimes add test examples to the validation form to see if the user is able to recognize the sound entity. If the user succeeds, this information is stored with the votes (is_trustable field).
We also need to add some false examples to the validation form.

Work to be done:

  • Annotate "relevant" false examples here
  • Add a field false_examples in TaxonomyNode
  • Add the false examples the same way positive examples are sometimes added in the validation form and check that the user answers correctly.

code layout cleanup

A few things that might be easier to fix sooner rather than later, to make sure we use a more "djangoish" layout for some parts of the application (a sketch of the URL split follows the list):

  • add datasets/urls.py for dataset-specific URLs, instead of having them in the main urls file
  • Move templates specific to an application into that application - e.g. datasets/templates/datasets/foo.html instead of templates/foo.html
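A minimal sketch of the URL split, assuming Django 1.x-style url() patterns; the view name and pattern are illustrative:

# datasets/urls.py
from django.conf.urls import url
from datasets import views

urlpatterns = [
    url(r'^fsd/$', views.dataset, name='dataset'),  # illustrative pattern
]

# project urls.py
from django.conf.urls import include, url

urlpatterns = [
    url(r'^', include('datasets.urls')),
]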

Add Category model

For different reasons (to avoid using the json for holding information about categories, and for storing attributes like a priority score), we need to add a Category class to the models (a sketch is included after the lists below).

First, I plan to:

  • Add it to the models
  • Add the attributes that correspond to the fields in the ontology.json file
  • Add a ManyToManyField relation to Category in Dataset
  • Replace the value CharField of Annotation with a ForeignKey relation to Category
  • Add a process for loading and creating the instances of Category (from ontology.json file)
  • Maybe move some methods of Taxonomy that make more sense to have in Category

Later:

  • Add priority score attribute
  • Add an async process for calculating the priority score
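A minimal sketch of what the model could look like, assuming the fields mirror the entries of ontology.json (all names are illustrative, not final):

from django.db import models

class Category(models.Model):
    node_id = models.CharField(max_length=30)          # ontology id, e.g. "/m/09x0r"
    name = models.CharField(max_length=200)
    description = models.TextField(blank=True)
    children = models.ManyToManyField('self', symmetrical=False, blank=True)
    # Added later: priority score, recomputed by an async task.
    priority_score = models.FloatField(default=0.0)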

Add a command for consolidating examples into Annotations

We are working on adding some examples for each category in order to have a quality-control mechanism #30.

Sometimes these examples are not instances of Annotation.
The way the validation form is created uses Annotation instances that then receive votes.
It would be easier and more logical to also have the examples as Annotation instances.

We should add a command for "consolidating" the examples, i.e., creating the corresponding Annotation instances.

About page very slow to load

We noticed that the about page (https://datasets.freesound.org/fsd/) is quite slow to load (~7 sec).

The ontology tree might cause this: creating the nested dictionary that is then used by the tree.js library takes a while.
I am considering adding an Ajax call to load this data asynchronously so that the page loads more quickly.

Some pages are slow to load

Some pages of the website are very slow to load, basically the pages containing TaxonomyNode information.
There is extensive use of the taxonomy_node_data and taxonomy_node_minimal_data template tags, which perform unnecessary queries.

I propose to rewrite some view functions to compute only what is needed for each specific page.
We could also define more minimalistic template tag filters that return only a few pieces of information, like a TaxonomyNode name or url_id, from a node_id; a sketch follows.
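A minimal sketch of such a filter, assuming a template tag module in the datasets app and that TaxonomyNode has name and node_id fields (module and field names are assumptions):

# datasets/templatetags/dataset_templatetags.py
from django import template
from datasets.models import TaxonomyNode

register = template.Library()

@register.filter
def node_name(node_id):
    # Fetch only the name column instead of serializing the whole node.
    return TaxonomyNode.objects.values_list('name', flat=True).get(node_id=node_id)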

Score for prioritizing categories to raters

#29 discusses a way of showing a list of sound categories for raters to select. A proposed criterion to sort them is by using a numeric score, as follows.

Score = G - #GT

Where:

  • G is the Goal, which, for starters, could be the balanced version of the Full subset, i.e., every category must have G sound clips. For instance, G = 10.

  • #GT is the number of annotations that become GT by user agreement, that is, 2 users rating the annotation as Present.

The initial state of a category is Score = G. At the end, Score = 0 (i.e., mission accomplished for the mentioned Goal). When Score is not positive, the category could be blocked (preventing users from choosing it).

Note:
The Score above should account for the number of votes present in a category; the greater the number of votes, the closer to GT generation.

Add category FAQ

It has been decided to create class-specific FAQs that will be shown at the top of validation pages.
FAQs can be added from the admin page in HTML format.

Interesting comments may be present in the sheet containing the Freesound examples.

How suitable is AudioSet Ontology for Freesound content?

Interaction with participants of the conducted annotation experiments raised interesting discussions. One of them was to what extent the AudioSet Ontology is suitable for Freesound content, for example:

  • whether the granularity level is appropriate or not,
  • what types of sound are not covered by the ontology (e.g., aliens or zombies), or
  • how certain particularities of a few Freesound samples fit into it (imitations, synthesized or heavily processed sounds)

Signalling presence of another label in an audio clip

Using the current validation user interface, you can only say whether the audio clip contains a given audio class; it is not possible to report what other labels it contains.

For example, imagine you are given fireworks clips to validate. There might be a number of other labels in it, such as crowds, male speech, female speech, child speech and so on. It would be good to be able to report that those labels also apply to the clip, so that the labelling is more complete.

Have you considered adding this feature?

Change dataset stats displayed

In the platform, we show some statistics about the dataset.
Now that the same sound can be voted on several times and we have a strategy for ground-truth generation, the current statistics do not reflect the state of our dataset.

The "% validated annotations" corresponds to how many annotations have been voted on at least once.
Also, "validated" sounds like the annotation was rated as "Present" (or does it?).

In the taxonomy category table, "% validated annotations" (which corresponded to the % of annotations that have been voted on once) has been changed to "# validated annotations", and it now shows the number of annotations that are considered "Present" ground truth.

Maybe we should use a different term than "validated". In the paper, this term is also used and refers to the % of annotations that have been voted on once (whether the vote is Present or not).

Mapping to create list of candidates for sound categories

Currently, lists of candidates for sound categories are the result of a tag-matching process. Freesound tags were manually assigned to the target categories, which were automatically populated with the Freesound clips presenting the aforementioned tags.

This process presents certain limitations that may (or not) be improved:

  • we have some categories with very few annotations.
  • we could have higher precision (currently, around 40% of the annotations are not validated as Present, which wastes annotation effort)
  • perhaps we could also have higher recall (make sure we don't leave Freesound clips out, and all possible annotations are assigned to a clip).

How can we improve the mapping?
For instance, by utilizing content-based techniques that use the annotations already validated as Present. In this respect, @soramas has done some tests and @xavierfav has some proposals.

When the mapping to use is decided (either the current one or another), it will be the starting point for other tasks, e.g., #26. The mapping will be able to populate categories to some extent, with some estimated precision. This will have to be taken into account to determine an initial subset of annotations to focus on for the FSD 1st release.

Quality control

As mentioned in #27 , ideally, the annotation protocol will consist of a training phase followed by a validation phase. Quality control mechanisms must be designed and implemented to ensure that gathered annotations are reliable. Among the possible approaches are:

  • some works focus on computing a quality score for each worker that contributed annotations. However, this may require gathering a significant amount of annotations.

  • A more basic approach: every now and then we could present some annotations for which the response is (i) evident and (ii) known, to check the worker's responses. It is possible that GoodSounds has implemented something along these lines.

Further literature review may be required.

Get all Categories for a User is slow

When a user goes to the page for selecting a category to annotate, the priority table lists all the categories that the user can contribute to.

Because a user can vote only once on an annotation, it is possible that there are no more annotations to validate for a specific category.
For now, Dataset.get_categories_to_validate(user) gets all the categories and then checks them one by one to see if there are annotations left for the user to validate. This process is quite slow (it takes about 10 seconds), and with several users on the platform it could become problematic (just a supposition).

We could probably get all the categories with one query, as sketched below.
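A minimal sketch of a single-query alternative; the relation names (votes, taxonomy_node) and the link from annotations to the dataset are assumptions about the actual models:

def get_categories_to_validate(self, user):
    # Categories that still have at least one annotation this user has not voted on.
    remaining_nodes = (Annotation.objects
                       .filter(sound_dataset__dataset=self)        # assumed relation to the dataset
                       .exclude(votes__created_by=user)            # drop annotations already voted on by this user
                       .values_list('taxonomy_node', flat=True)
                       .distinct())
    return TaxonomyNode.objects.filter(pk__in=remaining_nodes)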

Map of web

Sketch the platform's site map and its rough content, as we want it to be for the crowdsourcing launch. A simple initial example:

  • Initial welcome page:
    • Description of Freesound Datasets project
    • List of currently hosted datasets with a brief description and a link to a page for each dataset
      • For every dataset:
        • Purpose & main characteristics
        • Available tasks to contribute
        • Releases available
          • For every release: functionalities supported (exploration, reporting, download, etc)

@xavierfav has already begun defining some of these concepts.
