openmic-2018

Tools and tutorials for the OpenMIC-2018 dataset.

Overview

This repository contains companion source code for working with the OpenMIC-2018 dataset, a collection of audio and crowd-sourced instrument labels produced in a collaboration between Spotify and New York University's MARL and Center for Data Science. The cost of annotation was sponsored by Spotify, whose contributions to open-source research can be found online at the developer site, engineering blog, and public GitHub.

If you use this dataset, please cite the following work:

Humphrey, Eric J., Durand, Simon, and McFee, Brian. "OpenMIC-2018: An Open Dataset for Multiple Instrument Recognition." In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.

Download the Dataset

The OpenMIC-2018 dataset is available on Zenodo. After downloading, decompress it with your favorite command-line tar utility:

$ tar xvzf openmic-2018-v1.0.0.tgz -C some/dir

This will expand into some/dir/openmic-2018, with the following structure:

openmic-2018/
  acknowledgement.md
  audio/
    000/
      000046_3840.ogg
      ..
    ..
  checksums
  class-map.json
  license-cc-by.txt
  openmic-2018-aggregated-labels.csv
  openmic-2018-individual-responses.csv
  openmic-2018-metadata.csv
  openmic-2018.npz
  partitions/
    train01.txt
    test01.txt
  vggish/
    000/
      000046_3840.json
      ..
    ..
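The layout above suggests that each sample's audio and VGGish files can be located from its sample key alone. The following is a minimal sketch of that lookup. The helper name `sample_paths` is hypothetical and is not part of the openmic library. The sketch assumes the subdirectory name is the first three characters of the sample key, which is inferred only from the single example shown in the listing.

```python
from pathlib import Path


def sample_paths(root, sample_key):
    """Return the expected audio and VGGish paths for one sample key.

    Assumption: the subdirectory name is the first three characters of
    the sample key, as in audio/000/000046_3840.ogg from the listing.
    """
    root = Path(root)
    subdir = sample_key[:3]
    return {
        "audio": root / "audio" / subdir / f"{sample_key}.ogg",
        "vggish": root / "vggish" / subdir / f"{sample_key}.json",
    }


paths = sample_paths("some/dir/openmic-2018", "000046_3840")
print(paths["audio"].as_posix())   # some/dir/openmic-2018/audio/000/000046_3840.ogg
print(paths["vggish"].as_posix())  # some/dir/openmic-2018/vggish/000/000046_3840.json
```

The function only builds paths and does not check that the files exist, so it is worth verifying a few results against the extracted dataset before relying on it.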

The openmic-2018.npz file is a Python-friendly composite of the VGGish features and the aggregated labels from openmic-2018-aggregated-labels.csv. An example of how to train and evaluate a model is provided in a tutorial notebook.
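Before loading the composite, it can help to see which arrays it contains. An .npz file is a zip archive with one .npy member per array, so the names can be listed with the standard library alone. The sketch below does this. The helper name `list_npz_arrays` is hypothetical, and the sketch makes no assumption about which array names the file actually holds.

```python
import zipfile


def list_npz_arrays(npz_path):
    """List the array names stored in an .npz archive.

    Each array is stored as a zip member named "<array name>.npy", so
    the names can be read without loading any array data.
    """
    with zipfile.ZipFile(npz_path) as archive:
        return [
            name[: -len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        ]
```

For example, `list_npz_arrays("some/dir/openmic-2018/openmic-2018.npz")` returns the names that can then be used as keys on the object returned by `numpy.load`. The tutorial notebook remains the reference for how the arrays are meant to be used.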

Installing

To use the provided openmic Python library, first clone the repository and change directory into it:

$ git clone https://github.com/cosmir/openmic-2018.git
$ cd ./openmic-2018

Next, download the VGGish model parameters with the following script:

$ ./scripts/download-deps.sh

Finally, install the Python library, e.g. with pip:

$ pip install .

Errata

When initially collecting data, ten audio files were corrupted due to an issue in the source FMA dataset:

'071826', '071827', '087435', '095253', '095259',
'095263', '102144', '113025', '113604', '138485'
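If you want to identify samples that come from these files, one option is to compare the prefix of each sample key against the list. The sketch below makes two assumptions: that the listed values are track IDs, and that a sample key has the form `<track id>_<offset>`, as in `095253_134400`. The names `CORRUPTED_TRACK_IDS` and `is_from_corrupted_track` are hypothetical and not part of the openmic library.

```python
# Track IDs copied from the errata list above.
CORRUPTED_TRACK_IDS = frozenset({
    "071826", "071827", "087435", "095253", "095259",
    "095263", "102144", "113025", "113604", "138485",
})


def is_from_corrupted_track(sample_key):
    """Return True if the sample key's prefix is in the errata list.

    Assumption: the text before the first underscore is the track ID.
    """
    track_id = sample_key.split("_", 1)[0]
    return track_id in CORRUPTED_TRACK_IDS


keys = ["000046_3840", "095253_134400", "113025_99840"]
print([k for k in keys if is_from_corrupted_track(k)])
# ['095253_134400', '113025_99840']
```

Whether to exclude such samples or only flag them for inspection depends on your use case. The sketch reports them and does not drop anything.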

Of the 41k responses obtained, only three resulted in erroneous labels by annotators. The following rows have been manually corrected:

Sample Key      Instrument          True Label
095253_134400   piano               yes
095263_96000    mallet percussion   yes
113025_99840    trumpet             yes
