On the main page it's said that Emy is using persistent storage for the fingerprints,

Emy hits the memory limit

uoziod commented on May 24, 2024

Emy hits the memory limit


Comments (1)

AddictedCS commented on May 24, 2024

From the email reply:

A short answer: Emy was never intended to store large amounts of audio content (more than 1,000 hours). Its main goal is to precisely identify the start and end location of a match, which is highly sought after in other business domains. Building Shazam-style storage requires changes not only to the storage but also to the fingerprint signature, which is out of scope for this project.

A longer answer: Emy uses locality-sensitive hashing (LSH) to cross-match audio fingerprints. Any LSH-based algorithm (or, more broadly, any approximate nearest neighbor algorithm!) uses RAM-based storage; that is an inherent limitation, be it faiss, HNSW, or literally any other ANN algorithm, with the notable exception of DiskANN, which may or may not be suitable for SoundFingerprinting. Rewriting the storage is nowhere near a trivial task, and I'm not interested in pursuing it at this point for the reasons described in the short answer.
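
To illustrate why an LSH index tends to live in RAM, here is a minimal sketch of a generic min-hash LSH index in Python. It is not Emy's implementation: the class name `InMemoryLshIndex`, the parameters `num_hashes` and `bands`, and the integer feature sets are all assumptions made for the example. The point is only that every band keeps its own hash table, and every query probes all of them at random keys, which is cheap in memory and expensive on disk.

```python
import random
from collections import defaultdict


class InMemoryLshIndex:
    """Toy min-hash LSH index (hypothetical, for illustration only)."""

    def __init__(self, num_hashes=20, bands=5, seed=42):
        assert num_hashes % bands == 0, "num_hashes must be divisible by bands"
        rng = random.Random(seed)
        self.prime = 2_147_483_647  # large prime used by the hash family
        # Each (a, b) pair defines one hash function h(x) = (a*x + b) mod prime.
        self.params = [
            (rng.randrange(1, self.prime), rng.randrange(0, self.prime))
            for _ in range(num_hashes)
        ]
        self.rows = num_hashes // bands
        # One hash table per band; all of them are held in RAM.
        self.tables = [defaultdict(set) for _ in range(bands)]

    def _signature(self, features):
        # Min-hash signature: the minimum hash value per hash function.
        return [
            min((a * x + b) % self.prime for x in features)
            for a, b in self.params
        ]

    def _band_keys(self, features):
        sig = self._signature(features)
        return [tuple(sig[i:i + self.rows]) for i in range(0, len(sig), self.rows)]

    def insert(self, item_id, features):
        # Every item adds one entry to every band table.
        for table, key in zip(self.tables, self._band_keys(features)):
            table[key].add(item_id)

    def query(self, features):
        # Candidates are items sharing at least one band key with the query.
        candidates = set()
        for table, key in zip(self.tables, self._band_keys(features)):
            candidates |= table.get(key, set())
        return candidates


index = InMemoryLshIndex()
index.insert("track-1", {1, 5, 9, 200})
matches = index.query({1, 5, 9, 200})
```

In this sketch each inserted item costs one entry per band, so memory grows linearly with the number of stored fingerprints multiplied by the number of tables.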

You can take a look at Milvus and the PaddleSpeech audio search demo to see whether their implementation uses less RAM and suits your purposes: https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos/audio_searching
AFAIK they face the exact same dilemma, and their demo states that for CN-Celeb, a dataset of a mere 1,024 hours of content, they needed a machine with 132 GB of RAM.
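
As a rough back-of-the-envelope check on those numbers, the following Python snippet works out the RAM per hour of audio implied by the figures quoted above (132 GB for 1,024 hours). The helper names and the linear-scaling extrapolation are assumptions made for illustration; they say nothing about Emy's own memory footprint, which depends on its fingerprint configuration.

```python
def ram_per_hour_gb(total_ram_gb, hours):
    """RAM per hour of indexed audio, from a total and an hour count."""
    return total_ram_gb / hours


def estimated_ram_gb(hours, per_hour_gb):
    """Naive estimate that assumes RAM grows linearly with content length."""
    return hours * per_hour_gb


# Figures quoted in the comment above: 132 GB of RAM for 1,024 hours.
per_hour = ram_per_hour_gb(132, 1024)      # about 0.129 GB per hour
ten_k_hours = estimated_ram_gb(10_000, per_hour)

print(f"{per_hour:.3f} GB per hour")
print(f"{ten_k_hours:.0f} GB for 10,000 hours (linear assumption)")
```

Under that linear assumption, a catalogue ten times larger would need on the order of a terabyte of RAM, which is the scale at which a purely in-memory index stops being practical on a single machine.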
