pranavgupta2603 / simclr-urbansound8k

Audio classification with SimCLR-UrbanSound8K. This repository applies SimCLR, a self-supervised contrastive learning framework, to urban sound categorization using the UrbanSound8K dataset.

License: MIT License

Jupyter Notebook 100.00%
audio audio-classification music self-supervised-learning spectrogram urbansound8k

SimCLR Implementation for UrbanSound8K Classification

Introduction

This project applies SimCLR (A Simple Framework for Contrastive Learning of Visual Representations) to the classification of urban sounds using the UrbanSound8K dataset. The goal is to classify urban sounds such as sirens and car horns from representations learned through self-supervised contrastive training.

Dataset

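This project uses the UrbanSound8K dataset, which contains 8,732 labeled audio excerpts (each up to 4 seconds long) across 10 urban sound classes. The audio clips are converted to Mel-spectrogram images, a visual time-frequency representation of the audio, which serve as the input to the SimCLR model.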

Code for Audio to Spectrogram Conversion

The audio is converted to Mel-spectrogram images using code from this GitHub repository: UrbanSound8k-MelSpectrogram. This step prepares the dataset in a format the model can process.
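To illustrate what this conversion involves, here is a minimal NumPy sketch of a log-Mel spectrogram. It is not the code from the UrbanSound8k-MelSpectrogram repository; the function names and the parameters (`n_fft=1024`, `hop=512`, `n_mels=64`, the HTK-style mel formula) are assumptions chosen for illustration. In practice a library routine such as `librosa.feature.melspectrogram` would typically be used instead.

```python
import numpy as np

def hz_to_mel(hz):
    # HTK-style mel scale (an assumption; other variants exist)
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 band edges, evenly spaced on the mel scale
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        left, center, right = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr, n_fft=1024, hop=512, n_mels=64):
    """Return a (n_mels, n_frames) log-mel spectrogram in dB."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T  # (n_mels, n_frames)
    return 10.0 * np.log10(mel + 1e-10)                # small offset avoids log(0)

# Usage on a synthetic 1-second, 440 Hz tone (stand-in for a real clip)
sr = 22050
t = np.arange(sr) / sr
spec = log_mel_spectrogram(np.sin(2 * np.pi * 440.0 * t), sr)
print(spec.shape)  # (64, 42)
```

The resulting 2-D array can be saved as an image, which is the form the SimCLR encoder consumes in this project.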

Architecture

  • SimCLR Framework: A self-supervised learning model used to learn representations of audio data.
  • Classifier: A neural network that classifies audio based on the representations learned by SimCLR.

Hyperparameters

  • Epochs: 15
  • Number of Folds: 10 (Cross-validation approach)
  • Batch Size: 32
  • Learning Rate: 0.001
  • Weight Decay: 1e-6
  • Optimizer: Adam
  • Loss Function: NTXentLoss (Contrastive Loss)
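For readers unfamiliar with the NT-Xent (normalized temperature-scaled cross-entropy) loss listed above, the following NumPy sketch shows what it computes. This is not the repository's training code, which is not shown here; the function name `nt_xent_loss` and the default `temperature=0.5` are assumptions made for illustration, since the temperature used in this project is not stated.

```python
import numpy as np

def nt_xent_loss(z_i, z_j, temperature=0.5):
    """NT-Xent loss for a batch where z_i[k] and z_j[k] are two views of sample k.

    z_i, z_j: arrays of shape (n, d). The temperature default is an assumption.
    """
    n = z_i.shape[0]
    z = np.concatenate([z_i, z_j], axis=0)             # (2n, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit norm, so dot = cosine
    sim = (z @ z.T) / temperature                      # (2n, 2n) scaled similarities
    np.fill_diagonal(sim, -np.inf)                     # a sample is never its own negative
    # Row k's positive is row k + n, and vice versa
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()

# Usage: matched views give a lower loss than mismatched ones
views = np.eye(4)
print(nt_xent_loss(views, views) < nt_xent_loss(views, np.roll(views, 1, axis=0)))
```

Each embedding is pulled toward its paired view and pushed away from the other 2n - 2 embeddings in the batch, which is why the batch size also determines the number of negatives per sample.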

Outputs

The model was trained across multiple folds, showing consistent improvement in accuracy. Here are some highlights:

  • Validation Accuracy: Roughly 65% to 81%, depending on the epoch and fold.
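The fold-based evaluation can be sketched as follows. UrbanSound8K ships with 10 predefined folds, and a common protocol is to hold out one fold at a time; whether this repository follows exactly that protocol is an assumption. The helper names `leave_one_fold_out` and `summarize` are hypothetical, and the toy fold ids below are placeholders, not real data.

```python
from statistics import mean

def leave_one_fold_out(fold_ids):
    """Yield (held_out_fold, train_indices, val_indices) for each distinct fold."""
    for held_out in sorted(set(fold_ids)):
        train = [i for i, f in enumerate(fold_ids) if f != held_out]
        val = [i for i, f in enumerate(fold_ids) if f == held_out]
        yield held_out, train, val

def summarize(per_fold_accuracy):
    """Reduce a {fold: accuracy} mapping to its min, max, and mean."""
    accs = list(per_fold_accuracy.values())
    return {"min": min(accs), "max": max(accs), "mean": mean(accs)}

# Usage with toy fold assignments (six samples spread over three folds)
fold_ids = [1, 1, 2, 2, 3, 3]
for held_out, train_idx, val_idx in leave_one_fold_out(fold_ids):
    print(held_out, train_idx, val_idx)
```

Reporting the minimum, maximum, and mean across folds, as `summarize` does, makes a range like the one above easier to compare with other results on the same dataset.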

Conclusion

This implementation applies SimCLR, a framework designed for images, to urban sound classification. The results are promising and suggest that self-supervised contrastive learning transfers to audio when clips are represented as Mel-spectrograms.

