NeuralNote

NeuralNote is the audio plugin that brings state-of-the-art audio-to-MIDI conversion into your favorite Digital Audio Workstation.

  • Works with any tonal instrument (voice included)
  • Supports polyphonic transcription
  • Supports pitch bends
  • Lightweight and very fast transcription
  • Can scale and time quantize transcribed MIDI directly in the plugin

Install NeuralNote

Download the latest release for your platform here (Windows and macOS (Universal) supported)!

Currently, only the raw .vst3, .component (Audio Unit), .app and .exe (Standalone) files are provided. Installers will be created soon. In the meantime, you can manually copy the plugin/app file into the appropriate directory. The code is signed on macOS but not on Windows, so you might have to perform a few extra steps to be able to use NeuralNote on Windows (to be documented soon).

Usage

UI

NeuralNote comes as a simple AudioFX plugin (VST3/AU/Standalone app) to be applied to the track you want to transcribe.

The workflow is very simple:

  • Gather some audio
    • Click record. This works both when recording live input and when playing back an existing track in the DAW.
    • Or drop an audio file on the plugin (.wav, .aiff and .flac supported).
  • The MIDI transcription instantly appears in the piano roll section. Play with the different settings to adjust it.
  • Export the MIDI transcription with a simple drag and drop from the plugin to a MIDI track.

Watch our presentation video for the Neural Audio Plugin competition here.

Internally, NeuralNote uses the model from Spotify's basic-pitch. See their blog post and paper for more information. In NeuralNote, basic-pitch is run using RTNeural for the CNN part and ONNXRuntime for the feature part (Constant-Q transform calculation + harmonic stacking). As part of this project, we contributed 2D convolution support to RTNeural.
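
For reference, here is a minimal sketch of what the ONNXRuntime side of this split could look like in C++. This is not code from the NeuralNote repository: the model path, tensor shape and input/output tensor names are placeholders that would have to match the actual features_model.ort file (and on Windows the session constructor expects a wide-character path). A real implementation would also create the session once instead of per call.

#include <onnxruntime_cxx_api.h>
#include <array>
#include <vector>

// Hedged sketch: run a feature-extraction model (CQT + harmonic stacking)
// on a mono audio buffer with the ONNXRuntime C++ API.
std::vector<float> computeFeatures(const std::vector<float>& audio)
{
    static Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "features");
    Ort::SessionOptions options;
    Ort::Session session(env, "features_model.ort", options); // placeholder path
    // Shape [1, numSamples]: one batch of mono audio. The real model may expect a different layout.
    std::array<int64_t, 2> shape {1, static_cast<int64_t>(audio.size())};
    auto memInfo = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input = Ort::Value::CreateTensor<float>(
        memInfo, const_cast<float*>(audio.data()), audio.size(), shape.data(), shape.size());
    // Placeholder tensor names: query the session (or open the model in Netron) for the real ones.
    const char* inputNames[] = {"input"};
    const char* outputNames[] = {"features"};
    auto outputs = session.Run(Ort::RunOptions {nullptr}, inputNames, &input, 1, outputNames, 1);
    const float* data = outputs[0].GetTensorMutableData<float>();
    const size_t count = outputs[0].GetTensorTypeAndShapeInfo().GetElementCount();
    return std::vector<float>(data, data + count);
}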

Build from source

Requirements are: git, cmake, and your OS's preferred compiler suite.

Use this when cloning:

git clone --recurse-submodules --shallow-submodules https://github.com/DamRsn/NeuralNote

The following OS-specific build scripts have to be executed at least once before the project can be used as a normal CMake project. The script downloads the onnxruntime static library (which we created with ort-builder) before calling CMake.

macOS

$ ./build.sh

Windows

Due to a known issue, if you're not using Visual Studio 2022 (MSVC version 19.35.x, check the output of cl), you'll need to build onnxruntime.lib manually, as follows:

  1. Ensure you have Python installed; if not, download it from https://www.python.org/downloads/windows/

  2. Execute each of the following lines in a command prompt:

REM Clone the onnxruntime build setup used by NeuralNote
git clone --depth 1 --recurse-submodules --shallow-submodules https://github.com/tiborvass/libonnxruntime-neuralnote ThirdParty\onnxruntime
cd ThirdParty\onnxruntime
REM Create and activate a Python virtual environment, then install the build requirements
python3 -m venv venv
.\venv\Scripts\activate.bat
pip install -r requirements.txt
REM Convert the model to ORT format and build onnxruntime.lib with only the required operators
.\convert-model-to-ort.bat model.onnx
.\build-win.bat model.required_operators_and_types.with_runtime_opt.config
REM Copy the converted model into the NeuralNote source tree
copy model.with_runtime_opt.ort ..\..\Lib\ModelData\features_model.ort
cd ..\..

Now you can get back to building NeuralNote as follows:

> .\build.bat

IDEs

Once the build script has been executed at least once, you can load this project in your favorite IDE (CLion/Visual Studio/VSCode/etc) and click 'build' for one of the targets.

Reuse code from NeuralNote’s transcription engine

All the code to perform the transcription is in Lib/Model and all the model weights are in Lib/ModelData/. Feel free to use only this part of the code in your own project! We'll try to isolate it more from the rest of the repo in the future and make it a library.

The code to generate the files in Lib/ModelData/ is not currently available as it required a lot of manual operations. But here's a description of the process we followed to create those files:

  • features_model.onnx was generated with tf2onnx, by converting a Keras model containing only the CQT + harmonic stacking part of the full basic-pitch graph (with manually added weights for batch normalization).
  • The .json files containing the weights of the basic-pitch CNN were generated from the TensorFlow.js model available in the basic-pitch-ts repository: the model was converted to ONNX with tf2onnx, the weights were then extracted manually to .npy files with Netron, and finally applied to a split Keras model created with the basic-pitch code.

The original basic-pitch CNN was split into 4 sequential models wired together so that it can be run with RTNeural.
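
As a very rough sketch of what running such a chain with RTNeural's run-time (json) API could look like, the snippet below loads several weight files and pushes one frame of features through them in turn. The file names and the input size are placeholders, and the actual wiring of the four sub-models in Lib/Model is more involved than this straight chain.

#include <RTNeural/RTNeural.h>
#include <fstream>
#include <memory>
#include <vector>

// Hedged sketch: load RTNeural models from json weight files and run them
// frame by frame, feeding each model's outputs to the next one.
int main()
{
    std::vector<std::unique_ptr<RTNeural::Model<float>>> models;
    for (const auto* path : {"cnn_part_1.json", "cnn_part_2.json", "cnn_part_3.json", "cnn_part_4.json"}) // placeholder names
    {
        std::ifstream jsonStream(path, std::ifstream::binary);
        models.push_back(RTNeural::json_parser::parseJson<float>(jsonStream));
        models.back()->reset();
    }
    // One frame of input features; 8 is a placeholder for the first model's input size.
    std::vector<float> frame(8, 0.0f);
    const float* current = frame.data();
    for (auto& model : models)
    {
        model->forward(current);        // run one frame through this sub-model
        current = model->getOutputs();  // its outputs become the next model's inputs
    }
    return 0;
}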

Roadmap

  • Improve stability
  • Save plugin internal state properly, so it can be loaded back when reentering a session
  • Add tooltips
  • Build a simple synth in the plugin so that one can listen to the transcription while playing with the settings, before export
  • Allow pitch bends on non-overlapping parts of overlapping notes
  • Support transcription of mp3 files

Bug reports and feature requests

If you have any request/suggestion concerning the plugin or encounter a bug, please file a GitHub issue.

Contributing

Contributions are most welcome! If you want to add some features to the plugin or simply improve the documentation, please open a PR!

License

The NeuralNote software and code are published under the Apache-2.0 license. See the license file.

Third Party libraries used and license

Here's a list of all the third-party libraries used in NeuralNote and the licenses under which they are used.

Could NeuralNote transcribe audio in real-time?

Unfortunately no, for a few reasons:

  • Basic Pitch uses the Constant-Q transform (CQT) as its input feature. The CQT requires very long audio chunks (> 1 s) to get amplitudes for the lowest frequency bins, which makes the latency too high for real-time transcription (a rough estimate follows below this list).
  • The basic-pitch CNN has an additional latency of approximately 120 ms.
  • Very few DAWs support plugins with audio input and MIDI output, as far as I know. This is partly why NeuralNote is an Audio FX plugin (audio-to-audio) and why the MIDI is exported via drag and drop.
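
To put a rough number on the first point: for a constant-Q transform, the analysis window needed for the lowest bin is about N = Q * fs / fmin samples, with Q = 1 / (2^(1/B) - 1) for B bins per octave. The values below are only illustrative (not necessarily the exact parameters basic-pitch uses), but they already give a window well above one second.

#include <cmath>
#include <cstdio>

// Back-of-the-envelope CQT window length for the lowest bin.
// Illustrative parameters, not necessarily those used by basic-pitch.
int main()
{
    const double binsPerOctave = 36.0;  // e.g. 3 bins per semitone
    const double fMin = 32.7;           // lowest bin frequency in Hz (~C1)
    const double sampleRate = 22050.0;  // Hz
    const double q = 1.0 / (std::pow(2.0, 1.0 / binsPerOctave) - 1.0);  // quality factor, ~51.4
    const double windowSamples = q * sampleRate / fMin;                 // ~34700 samples
    std::printf("window for lowest bin: %.0f samples (%.2f s)\n",
                windowSamples, windowSamples / sampleRate);             // ~1.57 s
    return 0;
}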

But if you have ideas please share!

Credits

NeuralNote was developed by Damien Ronssin and Tibor Vass. The plugin user interface was designed by Perrine Morel.
