lucala / siamese-siren

Repository containing code for Siamese SIREN: Audio Compression with Implicit Neural Representations. Published as a workshop paper at the ICML 2023 Neural Compression Workshop.

Home Page: https://lucala.github.io/siamese-siren/

License: MIT License

HTML 37.90% Python 62.10%
audio-compression audio-processing implicit-neural-representation machine-learning siamese-neural-network

siamese-siren's Introduction

Siamese SIREN: Audio Compression with Implicit Neural Representations

Repository Overview

The repository contains the following files:

  • dataset.py: downloads and prepares the data, either GTZAN or LibriSpeech (note: GTZAN needs to be downloaded manually)
  • model.py: contains the positional encoding, SIREN, sine layer, and Siamese SIREN definitions
  • model_config.py: contains different model configurations for easier sweeping
  • run_sweep.py: sweeps over different model configurations by training on a dataset and stores various results in npy files
  • train_and_eval_siam.py: trains a (Siamese) SIREN on a single sample and outputs various results
  • environment.yml: contains the Anaconda environment with Python package dependencies

Environment Setup

Run the following snippet inside the project directory:


conda env create --file environment.yml
conda activate siamese-siren

Quick intro to Implicit Neural Representations (INRs)

INRs can be used to represent (and store) data. If we think of an audio waveform as a function f, where f(t) is the wave amplitude at time t, we can try to learn this function. We normalize t into the range [-1, 1]. An INR learns to approximate this function f, so we can store the weights of the learned INR, which now encode the audio. When we want to reconstruct the audio, we simply evaluate the INR for t in the interval [-1, 1]. INRs have some interesting properties. For example, they are resolution-invariant: the weights of the INR do not change depending on the sampling rate of the audio, and we can sample at any arbitrary sampling rate at runtime.
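
The idea above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in model.py: the helper names (time_grid, init_siren, siren_forward), the layer sizes, and the frequency factor w0 = 30 are assumptions chosen for the example, the initialization follows the scheme commonly used for SIREN networks, and training is omitted entirely. It only shows that one fixed set of weights can be evaluated on time grids of different resolution.

```python
import numpy as np


def time_grid(num_samples):
    """Return num_samples time coordinates normalized to [-1, 1], shape (N, 1)."""
    return np.linspace(-1.0, 1.0, num_samples).reshape(-1, 1)


def init_siren(layer_sizes, w0=30.0, seed=0):
    """Create untrained weights for a small sine-activated MLP (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    params = []
    for i, (fan_in, fan_out) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
        # Assumed SIREN-style bounds: wider for the first layer, scaled by w0 afterwards.
        bound = 1.0 / fan_in if i == 0 else np.sqrt(6.0 / fan_in) / w0
        weights = rng.uniform(-bound, bound, size=(fan_in, fan_out))
        biases = np.zeros(fan_out)
        params.append((weights, biases))
    return params


def siren_forward(params, t, w0=30.0):
    """Evaluate the network at time coordinates t of shape (N, 1); returns shape (N,)."""
    h = t
    for weights, biases in params[:-1]:
        h = np.sin(w0 * (h @ weights + biases))  # sine layer
    weights, biases = params[-1]
    return (h @ weights + biases).squeeze(-1)  # linear output: one amplitude per t


# The stored representation is just the parameter list, whatever the sampling rate.
params = init_siren([1, 64, 64, 1])
num_weights = sum(w.size + b.size for w, b in params)

# The same weights evaluated on two grids of different resolution.
audio_16k = siren_forward(params, time_grid(16000))
audio_48k = siren_forward(params, time_grid(48000))
```

In a real pipeline the parameters would first be fitted so that siren_forward(params, t) approximates f(t) for the target recording; only then would the stored weights stand in for the audio.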

siamese-siren's Issues

[REQ] add a (GH-compliant) license file

Hi there, first of all, thanks for your awesome work!

Since we've "doxed" it in our HyMPS project (under the AUDIO \ AI-based projects page \ Codecs subsection), could you please add a "GH-standardized" license file?

Making the licensing terms explicit is extremely important so that other developers (and others) understand how to reuse, adapt, or modify your code in other open projects, and vice versa.

Although it may sound like a minor aspect, omitting the license file also causes the corresponding badge to be generated inconsistently:


(generative URL: https://flat.badgen.net/github/license/lucala/siamese-siren/?label=LICENSE)

Anyway, you can easily set a "compliant" one through GitHub's license wizard tool.

Last but not least, let us know how, in your opinion, we could improve the categorization and sorting of the collected projects in order to push their evolution by favouring collaboration between developers (and others).

Thanks in advance.
