
gsoc2022-label-buddy's Introduction

Google Summer of Code with GFOSS 🌞

Project: Label Buddy 2.0: Automated audio-tagging using transfer learning

Mentors: Pantelis Vikatos, Agisilaos Kounelis, Ioannis Sina

Past Mentor: Markos Gogoulos

Contributor: Ioannis Prokopiou

Past Contributor: Ioannis Sina

Introduction

An annotation tool lets people without specialist knowledge mark a segment of an audio file (waveform), an image, text, etc. in order to specify the segment’s properties. Annotation tools are used in machine learning applications such as Natural Language Processing (NLP) and Object Detection to train machines to identify objects or text. While a variety of annotation tools exist, most of them lack the multi-user feature (multiple users annotating a single project simultaneously), whose implementation is planned in this project. The audio annotation process is usually tedious and time-consuming; tools that provide the multi-user feature are therefore necessary to reduce the effort needed and to enhance the quality of annotations. In most tasks related to audio classification, speech recognition, music detection, etc., machine and deep learning models are trained and evaluated on audio that has previously been annotated by humans. Such a tool should therefore lead to more accurate annotations, since each file will have been annotated by more than one person, providing a more reliable dataset. In effect, multi-user annotation reduces the possibility of human error: for example, an occasional mislabelled segment might be pointed out by another annotator.

Deep learning models can be used for annotation and can kickstart development by enabling faster annotation of datasets for AI algorithms. However, deep learning models are sensitive to the data used to train them, which makes it hard to train a model on one dataset and deploy it on a different one. As a solution, transfer learning for sound can help adapt pretrained models to various datasets: models used for annotation can be tuned and improved by retraining these pretrained models on new datasets.
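The transfer learning idea described above can be illustrated with a toy sketch: a "pretrained" feature extractor is kept frozen, and only a small classification head is trained on the new dataset. Everything here is an assumption made for illustration. `pretrained_embedding` is a hand-written stand-in rather than a real pretrained audio model, the clips are synthetic, and none of the names come from the Label Buddy code base.

```python
import math


def pretrained_embedding(clip):
    """Stand-in for a frozen pretrained model (hypothetical).

    Maps a clip (list of samples) to two features: mean absolute
    amplitude and zero-crossing rate. It is never updated during training.
    """
    energy = sum(abs(s) for s in clip) / len(clip)
    crossings = sum(1 for a, b in zip(clip, clip[1:]) if a * b < 0)
    return [energy, crossings / (len(clip) - 1)]


def train_head(features, labels, epochs=500, lr=0.5):
    """Train only a logistic-regression head on top of the frozen features."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            pred = 1.0 / (1.0 + math.exp(-z))
            error = pred - y  # gradient of the log loss with respect to z
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the head scores the features above zero, else 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0


def make_clip(amplitude, alternating, length=50):
    """Synthetic clip: sign-alternating samples (label 1) or constant (label 0)."""
    if alternating:
        return [amplitude * (-1) ** i for i in range(length)]
    return [amplitude] * length


# A tiny "new dataset" for the target task.
amplitudes = [0.2, 0.5, 0.8]
clips = [make_clip(a, True) for a in amplitudes] + [make_clip(a, False) for a in amplitudes]
labels = [1, 1, 1, 0, 0, 0]

# The extractor stays fixed; only the head's weights and bias are learned.
features = [pretrained_embedding(c) for c in clips]
weights, bias = train_head(features, labels)
```

In a real setup the stand-in extractor would be replaced by the layers of an actual pretrained network, and the head would be trained on annotated audio from the target project.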

Already existing annotation tools:

Label Studio: https://github.com/heartexlabs/label-studio

BAT annotation tool: https://github.com/BlaiMelendezCatalan/BAT

Computer Vision Annotation Tool (CVAT): https://github.com/openvinotoolkit/cvat

Project goals 🎯

This project is an enhancement of the previous work on Label Buddy. Its goal is to make annotation simple and easy while also providing a well-defined manager-annotator-reviewer framework. In particular, it uses Transfer Learning (TL) approaches to make the annotation process easier for the user by offering label predictions. The goals can be divided into two categories of tasks: Machine Learning and Django.

Machine Learning:

  • Conduct research for the appropriate model architecture
  • Modify the annotation process by integrating the model
  • Test the model by providing evaluation metrics
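As a rough illustration of what "providing evaluation metrics" could involve, the sketch below computes precision, recall and F1 for one label by comparing predicted labels with human annotations. The function name and the example labels are assumptions; the source does not specify the metrics the project actually reports.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Compute precision, recall and F1 for a single label of interest."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a label is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical segment labels: human annotations versus model predictions.
annotated = ["speech", "music", "speech", "music", "speech"]
predicted = ["speech", "speech", "speech", "music", "music"]
precision, recall, f1 = precision_recall_f1(annotated, predicted, positive="speech")
```

Computing the metrics per label rather than as one overall accuracy figure makes it easier to see which labels the model predicts poorly.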

Django:

  • Add lazy loading for the audio files: load segments of the file only when needed (as YouTube does, for example). This should improve performance when the audio file is very large.
  • Add Django Testing
  • Dockerization
  • Add documentation
  • Add rar file upload functionality - currently, users can only upload zip files (optional)
  • UI improvements (optional)
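One common way to load only the part of a file that is needed, as in the lazy loading task above, is an HTTP byte-range request: the client asks for `bytes=start-end` and the server returns just that slice. The sketch below is a minimal, framework-free illustration of that idea and not the project's implementation; `parse_range` and `read_segment` are hypothetical names, and only a single range is handled.

```python
import io


def parse_range(header, file_size):
    """Parse a single-range header such as 'bytes=0-1023' into inclusive offsets."""
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    first, _, last = spec.strip().partition("-")
    if first == "":
        # Suffix form 'bytes=-N': the last N bytes of the file.
        start, end = max(file_size - int(last), 0), file_size - 1
    else:
        start = int(first)
        # Open-ended form 'bytes=N-' runs to the end of the file.
        end = min(int(last), file_size - 1) if last else file_size - 1
    if start > end:
        raise ValueError("range not satisfiable")
    return start, end


def read_segment(stream, header, file_size):
    """Return the requested bytes and the matching Content-Range value."""
    start, end = parse_range(header, file_size)
    stream.seek(start)
    body = stream.read(end - start + 1)
    return body, f"bytes {start}-{end}/{file_size}"


# Stand-in for an audio file on disk.
data = bytes(range(100))
body, content_range = read_segment(io.BytesIO(data), "bytes=10-19", len(data))
```

A server that honours such a request replies with status 206 (Partial Content), so the player can start working with the first segment before the rest of the file has been transferred.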

Steps to run

Clone repository and cd to the folder

git clone https://github.com/eellak/gsoc2022-Label-buddy
cd gsoc2022-Label-buddy

Create virtual environment

python3 -m venv env

Activate it for Linux

source env/bin/activate

Activate it for Windows

env\Scripts\activate

Install the audiowaveform program by following the steps given at: https://github.com/bbc/audiowaveform#installation

Install requirements and cd to label_buddy/

pip install -r requirements.txt
cd label_buddy

Make migrations for the Database and all dependencies

python manage.py makemigrations users projects tasks
python manage.py migrate

After the above process, create a superuser and run the server

python manage.py createsuperuser
python manage.py runserver

Visit http://localhost:8000/admin, navigate to users/[your user] and set can_create_projects to true so you can start creating projects.

Visit https://labelbuddy.io/ and sign in with the following credentials:

  • Username: demo
  • Password: labelbuddy123

in order to create projects, upload files and annotate them.

Research & Models Documentation: https://docs.google.com/document/d/10Sd1WPcPpctzXY3lO9p6u4ubNzLg7SEUDKWJv09snqU/edit?usp=sharing

DockerHub Repository: https://hub.docker.com/r/ioannisprokopiou/yoho-training



gsoc2022-label-buddy's Issues

Lazy Loading: Research

Research to understand the needs and the means to be used in the implementation of the Lazy Loading task.

Add scalable way of running containers

With every new project created, a new container will be started.
We need to implement a scalable way of starting many of them using uWSGI, Nginx and docker-compose.

Dockerization with Flask

Dockerize the process so that we can train and predict through a Docker container for each specific project.

Lazy Loading: Manual Progress Bar

Since wavesurfer's "loading" event does not fire when backend: 'MediaElement' is used, we are looking for a more manual solution based on the amount of media loaded.
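The arithmetic behind such a manual progress bar can be sketched as follows: in the browser, a media element exposes the time ranges it has buffered and the total duration, and the percentage is the buffered time divided by the duration. The sketch only illustrates that calculation in Python, with `loaded_percentage` as a hypothetical name; the actual front-end code would be written in JavaScript.

```python
def loaded_percentage(buffered_ranges, duration):
    """Return how much of the media is loaded, as a percentage from 0 to 100.

    buffered_ranges is a list of (start, end) pairs in seconds, assumed
    not to overlap; duration is the total length in seconds.
    """
    if duration <= 0:
        return 0.0
    loaded = sum(end - start for start, end in buffered_ranges)
    return min(100.0, 100.0 * loaded / duration)


# Two buffered chunks of 30 seconds each in a 120-second track.
progress = loaded_percentage([(0, 30), (60, 90)], 120)
```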

Model Unit Tests

Add unit tests for all the models to ensure their functionality.

Add unit tests

Unit testing is a software development process in which the smallest testable parts of an application, called units, are individually and independently scrutinized for proper operation. We want to automate the process of testing all the parts of our application to be sure that we are taking every step with caution.
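A minimal sketch of what one such unit test could look like, using Python's standard `unittest` module, on which Django's own `django.test.TestCase` builds. The helper `is_zip_upload` is hypothetical and only mirrors the rule that uploads are currently limited to zip files; it is not taken from the project's code.

```python
import unittest


def is_zip_upload(filename):
    """Hypothetical helper: accept only names ending in .zip (case-insensitive)."""
    return filename.lower().endswith(".zip")


class ZipUploadTests(unittest.TestCase):
    """Each test checks one small, independent behaviour of the helper."""

    def test_accepts_zip(self):
        self.assertTrue(is_zip_upload("clips.zip"))

    def test_is_case_insensitive(self):
        self.assertTrue(is_zip_upload("CLIPS.ZIP"))

    def test_rejects_other_archives(self):
        self.assertFalse(is_zip_upload("clips.rar"))
```

Saved in a test module, a class like this can be run with `python -m unittest`; in a Django project the usual entry point is `python manage.py test`.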

Create dockerized setup

Dockerizing is the process of packing, deploying, and running applications using Docker containers. Docker is an open source tool that ships your application with all the necessary functionalities as one package. We want to dockerize our app so that it can be run with Docker.

Lazy Loading: Research Audio Segmentation

Research to explore ways to split an audio track if it is too big (easier annotation; less loading time and network consumption).

Possibilities:

  1. Simple segmentation with the use of ffmpeg based on segmentation time

  2. Silence-based segmentation with the use of pydub
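The first possibility, fixed-time segmentation, can be sketched as below. The code computes the segment boundaries and builds an ffmpeg command line that uses ffmpeg's segment muxer; it does not run ffmpeg. The function names, the output file pattern and the 30-second length are assumptions chosen for illustration. Silence-based splitting is not shown here.

```python
def fixed_segments(duration, segment_length):
    """Return (start, end) pairs in seconds that cover the whole track."""
    if duration <= 0 or segment_length <= 0:
        raise ValueError("duration and segment_length must be positive")
    duration = float(duration)
    segments = []
    start = 0.0
    while start < duration:
        # The last segment is shorter when the duration is not a multiple.
        end = min(start + segment_length, duration)
        segments.append((start, end))
        start = end
    return segments


def ffmpeg_segment_command(source, segment_length, pattern="segment_%03d.wav"):
    """Build an ffmpeg call that splits `source` into fixed-length files.

    The command is returned as a list and is not executed here.
    """
    return [
        "ffmpeg", "-i", source,
        "-f", "segment",
        "-segment_time", str(segment_length),
        "-c", "copy",
        pattern,
    ]


# A 95-second track split into 30-second pieces.
boundaries = fixed_segments(95, 30)
command = ffmpeg_segment_command("track.wav", 30)
```

The returned list could be passed to `subprocess.run` on a machine where ffmpeg is installed.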

Update README

README update to be aligned with the 2022 project.
