First of all thanks to the contributors of this library! I'm current


Processing batches of audio files through Essentia-Tensorflow pre-trained models (essentia issue, open, 8 comments)

burstMembrane commented on September 25, 2024

Processing batches of audio files through Essentia-Tensorflow pre-trained models

from essentia.

Comments (8)

palonso commented on September 25, 2024

Yes, you can initialize MonoLoader and TensorflowPredictVGGish outside the inference loop:

from essentia.standard import MonoLoader, TensorflowPredictVGGish
audio_paths = ["file1.wav", "file2.wav"]

loader = MonoLoader()
model = TensorflowPredictVGGish(graphFilename="audioset-vggish-3.pb", output="model/vggish/embeddings")

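for path in audio_paths:
    # configure() only sets the loader's parameters and returns None
    loader.configure(filename=path, sampleRate=16000, resampleQuality=4)
    audio = loader()  # calling the loader returns the decoded samples
    embeddings = model(audio)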


Galvo87 commented on September 25, 2024

Yes, you can initialize MonoLoader and TensorflowPredictVGGish outside the inference loop:

from essentia.standard import MonoLoader, TensorflowPredictVGGish
audio_paths = ["file1.wav", "file2.wav"]

loader = MonoLoader()
model = TensorflowPredictVGGish(graphFilename="audioset-vggish-3.pb", output="model/vggish/embeddings")

for audio in audio_paths:
    audio = loader.configure(filename=audio, sampleRate=16000, resampleQuality=4)
    embeddings = model(audio)

loader = MonoLoader()
print(loader)

returns TypeError: __str__ returned non-string (type NoneType).

It seems loader.configure() is not behaving well: it always returns None, including in your code above.


palonso commented on September 25, 2024

That's the expected return value for configure.


Galvo87 commented on September 25, 2024

Ok got it, but I still don't understand how this could work out...


palonso commented on September 25, 2024

Sorry @Galvo87!
It was a mistake in my example script.
I've updated the script and double-checked that it works.

The loader had to be configured first and then called.
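To illustrate why assigning the result of configure() did not work, here is a minimal sketch of the configure-then-call pattern. FakeLoader is a hypothetical stand-in written for this example, not essentia's implementation; it only mimics the behaviour described above, where configure() sets parameters and returns None, and calling the object produces the data.

```python
class FakeLoader:
    """Hypothetical stand-in that mimics the configure-then-call pattern."""

    def __init__(self):
        self.filename = None

    def configure(self, filename):
        # Only updates the object's state; like most setters it returns None.
        self.filename = filename

    def __call__(self):
        if self.filename is None:
            raise RuntimeError("configure() must be called before loading")
        # A real loader would decode the file here; return a dummy signal.
        return [0.0, 0.0, 0.0]


loader = FakeLoader()
result = loader.configure(filename="file1.wav")  # result is None
audio = loader()  # the call, not configure(), yields the samples
```

With this pattern, `audio = loader.configure(...)` binds `audio` to None, so anything downstream that expects samples fails.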


jbm-composer commented on September 25, 2024

@burstMembrane, did you find a good solution for batch processing? I have 8 GPUs and want to extract a bunch of embeddings as quickly as possible.

I noticed the "batch_size" argument, but it seems like that has to do with how many "patches" it will process from the input audio file, rather than an option to batch-process multiple audio files.

Any tips appreciated.


palonso commented on September 25, 2024

The simplest approach would be to modify this script to receive a list of files to process with something like argparse.

import argparse
from essentia.standard import MonoLoader, TensorflowPredictVGGish

def main(audio_paths):
    loader = MonoLoader()
    model = TensorflowPredictVGGish(graphFilename="audioset-vggish-3.pb", output="model/vggish/embeddings")

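    for path in audio_paths:
        # configure() only sets the loader's parameters and returns None
        loader.configure(filename=path, sampleRate=16000, resampleQuality=4)
        audio = loader()  # calling the loader returns the decoded samples
        embeddings = model(audio)

        # save the embeddings ...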

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Process audio files using VGGish model")
    parser.add_argument("audio_files", nargs="+", help="List of audio files to process")
    args = parser.parse_args()
    main(args.audio_files)
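The "save the embeddings" step is left open in the script above. One possible way to fill it in, as a sketch only: derive one output path per input file and skip inputs that already have an output, so an interrupted run can be resumed. The helper names and the "embeddings" output directory are assumptions made for this example. The array itself could then be written with something like numpy.save, since essentia's Python bindings return NumPy arrays.

```python
from pathlib import Path


def embedding_path(audio_path, out_dir="embeddings"):
    """Map an audio file to a .npy output path (hypothetical layout)."""
    # Note: files with the same stem in different folders would collide.
    return Path(out_dir) / (Path(audio_path).stem + ".npy")


def pending_files(audio_paths, out_dir="embeddings"):
    """Return only the inputs whose output file does not exist yet."""
    return [p for p in audio_paths if not embedding_path(p, out_dir).exists()]
```

Inside the loop, `pending_files(audio_paths)` would replace `audio_paths`, and each result would be written to `embedding_path(path)`.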

Then you can divide the file list you want to process into 8 chunks (e.g., split -n l/8 -d filelist filelist_part).

Finally, you can launch one script per GPU:

CUDA_VISIBLE_DEVICES=0 python extract_embeddings.py $(< filelist_part00)
...
CUDA_VISIBLE_DEVICES=7 python extract_embeddings.py $(< filelist_part07)
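The same split-and-launch idea can be sketched in Python instead of the shell. This is an illustration under assumptions, not a tested recipe: the function names are made up for this example, and the list is split round-robin rather than into contiguous line ranges as split -n l/8 does. Each child process sees a single GPU through CUDA_VISIBLE_DEVICES, as in the commands above.

```python
import os
import subprocess
import sys


def split_round_robin(items, n_chunks):
    """Deal items into n_chunks lists; chunk i gets items i, i+n, i+2n, ..."""
    return [items[i::n_chunks] for i in range(n_chunks)]


def launch_per_gpu(script, audio_paths, n_gpus):
    """Start one process per GPU and return their exit codes."""
    procs = []
    for gpu_id, chunk in enumerate(split_round_robin(audio_paths, n_gpus)):
        if not chunk:
            continue  # fewer files than GPUs: nothing to do on this one
        # Copy the current environment and expose only one GPU to the child.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
        procs.append(subprocess.Popen([sys.executable, script, *chunk], env=env))
    # Wait for every worker to finish.
    return [p.wait() for p in procs]
```

A call such as `launch_per_gpu("extract_embeddings.py", paths, 8)` would mirror the eight shell commands. For very long file lists, passing a list file to the script instead of individual paths may be safer, because command lines have a maximum length.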


jbm-composer commented on September 25, 2024

Thanks, yes, I actually realized there was something similar I could do: chunking my data into as many chunks as I have GPUs (8) and running a separate serial process for each GPU. Works well. (I also used batchSize=-1, which I think helps optimize a bit, though I'm not totally sure about that one.)
