Code Monkey home page Code Monkey logo

Comments (8)

palonso avatar palonso commented on September 25, 2024

Yes, you can initialize MonoLoader and TensorflowPredictVGGish outside the inference loop:

from essentia.standard import MonoLoader, TensorflowPredictVGGish
audio_paths = ["file1.wav", "file2.wav"]

loader = MonoLoader()
model = TensorflowPredictVGGish(graphFilename="audioset-vggish-3.pb", output="model/vggish/embeddings")

for audio in audio_paths:
    loader.configure(filename=audio, sampleRate=16000, resampleQuality=4)
    audio = loader()
    embeddings = model(audio)

from essentia.

Galvo87 avatar Galvo87 commented on September 25, 2024

Yes, you can initialize MonoLoader and TensorflowPredictVGGish outside the inference loop:

from essentia.standard import MonoLoader, TensorflowPredictVGGish
audio_paths = ["file1.wav", "file2.wav"]

loader = MonoLoader()
model = TensorflowPredictVGGish(graphFilename="audioset-vggish-3.pb", output="model/vggish/embeddings")

for audio in audio_paths:
    audio = loader.configure(filename=audio, sampleRate=16000, resampleQuality=4)
    embeddings = model(audio)
loader = MonoLoader()
print(loader)

returns TypeError: __str__ returned non-string (type NoneType).

It seems loader.configure() is not behaving well, it always returns None, also in your code above.

from essentia.

palonso avatar palonso commented on September 25, 2024

that's the expected return value for configure.

from essentia.

Galvo87 avatar Galvo87 commented on September 25, 2024

Ok got it, but I still don't understand how this could work out...

from essentia.

palonso avatar palonso commented on September 25, 2024

sorry @Galvo87!
It was a mistake in my example script.
I've updated the script and double-checked that it works.

The loader had to be configured first and then called.

from essentia.

jbm-composer avatar jbm-composer commented on September 25, 2024

@burstMembrane, did you find a good solution for batch processing? I have 8 GPUs and want to extract a bunch of embeddings as quickly as possible

I noticed the "batch_size" argument, but it seems like that has to do with how many "patches" it will process from the input audio file, rather than an option to batch-process multiple audio files.

Any tips appreciated.

from essentia.

palonso avatar palonso commented on September 25, 2024

The simplest approach would be to modify this script to receive a list of files to process with something like argparse.

import argparse
from essentia.standard import MonoLoader, TensorflowPredictVGGish

def main(audio_paths):
    loader = MonoLoader()
    model = TensorflowPredictVGGish(graphFilename="audioset-vggish-3.pb", output="model/vggish/embeddings")

    for audio in audio_paths:
        loader.configure(filename=audio, sampleRate=16000, resampleQuality=4)
        audio = loader()
        embeddings = model(audio)

        # save the embeddings ...

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Process audio files using VGGish model")
    parser.add_argument("audio_files", nargs="+", help="List of audio files to process")
    args = parser.parse_args()
    main(args.audio_files)
)

then you can divide the filelist you want to process in 8 chunks, (e.g., split -n l/8 -d filelist filelist_part)

Finally you can launch one script per GPU:

CUDA_VISIBLE_DEVICES=0 python extract_embeddings.py $(< filelist_part00)
...
CUDA_VISIBLE_DEVICES=7 python extract_embeddings.py $(< filelist_part07)

from essentia.

jbm-composer avatar jbm-composer commented on September 25, 2024

Thanks, yes, I actually realized there was something similar I could do, in just chunking my data into my GPU-count chunks (8) and having a separate serial process for each GPU. Works well. (I also used batchSize=-1, which think helps optimize a bit, though I'm not totally sure about that one.)

from essentia.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.