

Self-Supervised Learning for Audio Inference

This repository contains the code for all experiments in the research project Self-Supervised Learning for Audio Inference.

Abstract:
This report presents Audio Barlow Twins, a novel self-supervised audio representation learning approach. Audio Barlow Twins adapts the Barlow Twins [Zbontar et al., 2021] objective from Computer Vision (CV) to the audio domain. The Barlow Twins objective drives the empirical cross-correlation matrix between the embeddings of two distorted views of a data sample towards the identity matrix. This both enforces invariance to the applied data augmentations and explicitly prevents representational collapse by decorrelating the individual components of the embedding vectors (redundancy reduction). Audio Barlow Twins utilises a combination of different data augmentations, all of which act directly on the audio data preprocessed as mel-spectrograms, and considers both convolutional and Transformer encoder architectures. We pre-train on the large-scale audio dataset AudioSet and evaluate the quality of the learned representations on 18 tasks from the HEAR 2021 Challenge (https://arxiv.org/abs/2203.03022) [Turian et al., 2021], achieving results on a par with the current state-of-the-art for instance discrimination self-supervised approaches to audio representation learning. The impact of the individual components of the learning framework is analysed through extensive ablation studies.
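
As a rough illustration of the objective described in the abstract, the sketch below computes a Barlow Twins style loss on two small batches of embeddings. It is not taken from this repository: the function names `standardise` and `barlow_twins_loss`, the `eps` constant, and the off-diagonal weight `lambd` (set here to 5e-3 as an assumed default) are all hypothetical. It is written in plain Python for readability; a real training loop would use a tensor library and operate on encoder outputs.

```python
import math


def standardise(batch, eps=1e-12):
    """Give each embedding dimension zero mean and unit variance over the batch."""
    n, dim = len(batch), len(batch[0])
    out = [[0.0] * dim for _ in range(n)]
    for j in range(dim):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
        for b in range(n):
            # eps (assumed) guards against division by zero for constant columns
            out[b][j] = (column[b] - mean) / (std + eps)
    return out


def barlow_twins_loss(z_a, z_b, lambd=5e-3):
    """Sketch of the loss for two views' embeddings, each a list of N rows of D floats.

    lambd is an assumed weight for the redundancy-reduction term.
    """
    n, dim = len(z_a), len(z_a[0])
    z_a, z_b = standardise(z_a), standardise(z_b)

    # Empirical cross-correlation matrix (D x D) between the two views.
    c = [
        [sum(z_a[b][i] * z_b[b][j] for b in range(n)) / n for j in range(dim)]
        for i in range(dim)
    ]

    # Invariance term: pull diagonal entries towards 1.
    on_diag = sum((1.0 - c[i][i]) ** 2 for i in range(dim))
    # Redundancy-reduction term: pull off-diagonal entries towards 0.
    off_diag = sum(c[i][j] ** 2 for i in range(dim) for j in range(dim) if i != j)
    return on_diag + lambd * off_diag
```

Under these assumptions, two identical views whose embedding dimensions are mutually uncorrelated give a loss close to zero, whereas embeddings with duplicated (redundant) dimensions are penalised through the off-diagonal term.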

This work has been adapted into the paper Audio Barlow Twins: Self-Supervised Audio Representation Learning [Anton et al., 2022], which has been accepted at ICASSP 2023.

The full written report can be found in the file ABT_full_report.pdf.
