Comments (5)

ffont commented on June 8, 2024

I agree with what you propose.
You probably want to have these scores precomputed and stored in a property of the Annotation model, because otherwise it's probably complicated to compute all the scores in real time (especially if the score is complex to compute).

I suggest you start by implementing a function which, given an annotation, returns a "priority score". This could be a method of the Annotation class.
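
As a rough sketch of this idea (not the actual FSD code), the score could be computed by a method and stored on the annotation so it does not have to be recomputed when candidates are served. The `Annotation` class below is a plain stand-in for the Django model, and the field names (`num_votes`, `sound_duration`, `priority_score`) as well as the criteria and weights are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Stand-in for the Annotation model; all field names are hypothetical."""
    num_votes: int = 0
    sound_duration: float = 0.0  # seconds
    priority_score: float = 0.0  # precomputed value stored with the annotation

    def compute_priority_score(self) -> float:
        # Placeholder criteria and weights, for illustration only.
        score = 0.0
        if self.num_votes >= 1:
            score += 1.0
        if self.sound_duration < 30:
            score += 0.5
        return score

    def update_priority_score(self) -> None:
        # Store the result so it is not recomputed in real time.
        self.priority_score = self.compute_priority_score()


annotations = [
    Annotation(num_votes=0, sound_duration=45.0),
    Annotation(num_votes=2, sound_duration=12.0),
]
for annotation in annotations:
    annotation.update_priority_score()

# Highest stored score first.
ranked = sorted(annotations, key=lambda a: a.priority_score, reverse=True)
```

In a Django model the stored value would presumably be a regular model field that is refreshed whenever votes or metadata change, so that candidates can be ordered by it directly in the database query.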

from freesound-datasets.

xavierfav commented on June 8, 2024

For now, the prioritization is based on votes:
Annotations that have at least one vote are prioritized.

In order to include the other constraints listed in this post, we need some Freesound metadata that we don't have in the current platform (ratings, number of downloads).

@ffont
Should we use the API to get this data, or should I load it into our model so we have it in the FSD platform?

Moreover, about the first point: voting on all annotation candidates for a sound (in order to get closer to a "complete" annotation for a sound).
I would say that, as it is now, it is not worth doing: because we have not worked on population and on prioritizing leaf nodes, we would prioritize annotations that are not worth voting on (e.g. voting on "dog bark", "dog" and "animal"). We should first work on how to populate whenever an annotation is considered as ground truth.
We have been inspecting ambiguous cases with edufonseca (categories with more than one parent) to see whether or not it makes sense to distinguish two categories, and whether or not it makes sense to populate to the different parents.

ffont commented on June 8, 2024

@xavierfav We should use the API to load the data in the FSD platform ;)
There has always been the idea of writing a management command that iterates over all sounds and gets data from Freesound to store in the JSON field of each sound. I'm not sure if something similar was ever implemented (I guess not). I think this is the way to go: have a command that you can run from time to time to re-sync with Freesound.

When implementing the command, I'd iterate over all sounds in groups of N, and then use the API to make a search restricting the results to the IDs of these sounds (you can "OR" sound IDs in the search filter). Then, using the fields param, you decide which information you want returned and stored in the FSD platform. This way, the number of requests needed is n_sounds/N instead of n_sounds. N could theoretically be set to 150 (the max number of search results per page), but the limitation here is the length of the URL (as all the filtered sound IDs will be in the URL). I think N=50 should be fine. Otherwise, try lower or higher values.
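
A minimal sketch of the batching described above, which only builds the request URLs and sends nothing. The endpoint path, the `id:(... OR ...)` filter syntax, the `token` parameter and the field names (`avg_rating`, `num_downloads`, `duration`) are written from memory of the Freesound APIv2 and should be checked against its documentation; `chunked` and `build_search_url` are hypothetical helper names.

```python
from typing import Iterator, List, Sequence
from urllib.parse import urlencode

# Assumed text search endpoint of the Freesound APIv2.
SEARCH_URL = "https://freesound.org/apiv2/search/text/"
# Assumed field names for the metadata we want to store.
FIELDS = ("id", "avg_rating", "num_downloads", "duration")


def chunked(ids: Sequence[int], n: int) -> Iterator[List[int]]:
    """Yield consecutive groups of at most n sound IDs."""
    for start in range(0, len(ids), n):
        yield list(ids[start:start + n])


def build_search_url(batch: Sequence[int], fields: Sequence[str], token: str) -> str:
    """Build one search URL restricted to the sound IDs in the batch."""
    id_filter = "id:(" + " OR ".join(str(i) for i in batch) + ")"
    params = {
        "filter": id_filter,
        "fields": ",".join(fields),
        "page_size": len(batch),  # all results of the batch on a single page
        "token": token,
    }
    return SEARCH_URL + "?" + urlencode(params)


sound_ids = list(range(1, 121))  # placeholder IDs
batches = list(chunked(sound_ids, 50))
urls = [build_search_url(batch, FIELDS, "YOUR_API_KEY") for batch in batches]
# 120 sounds with N=50 need 3 requests instead of 120.
```

Printing `len(urls[0])` for a realistic batch of IDs is a quick way to see how close a given N gets to the URL length limit before choosing a value.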

edufonseca commented on June 8, 2024

In the constraints listed in the first comment, it was suggested to prioritize sounds with length < 30 sec. I think we should specify further in this direction. How about prioritizing (apart from the other aforementioned constraints):

  1. sounds with length < 10s (just as in AudioSet). This will presumably imply having more PP and also shorter sounds that, at this point, may be more useful.
  2. once those are exhausted, sounds with length < 20s
  3. once those are exhausted, sounds with length < 30s
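
The tiers above could be sketched as a sort key, so that every sound under 10s comes before any sound under 20s, and so on. The names `DURATION_TIERS` and `duration_tier` are hypothetical, and placing sounds of 30s or more in a last tier is an assumption, since the list does not say what happens to them.

```python
# Upper bounds in seconds for each priority tier, in order.
DURATION_TIERS = (10.0, 20.0, 30.0)


def duration_tier(duration: float) -> int:
    """Return 0 for < 10s, 1 for < 20s, 2 for < 30s, and 3 otherwise."""
    for tier, limit in enumerate(DURATION_TIERS):
        if duration < limit:
            return tier
    return len(DURATION_TIERS)


# Placeholder sound IDs and durations in seconds.
durations = {"a": 25.0, "b": 4.2, "c": 61.0, "d": 12.5, "e": 9.9}

# sorted() is stable, so sounds within the same tier keep their original order.
ordered = sorted(durations, key=lambda sound_id: duration_tier(durations[sound_id]))
```

Because the key is only the tier, it can be combined with other criteria (such as votes) by returning a tuple from the key function.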

xavierfav commented on June 8, 2024

Sounds with length < 10 sec are prioritized: #70
