
Comments (3)

eziolotta commented on June 2, 2024

I'm starting some tests on long-audio segmentation that take the speaker's voice activity into account.
On this fork: https://github.com/eziolotta/rVADfast

But I have some problems with the quality of the audio output...

from deepspeech-italian-model.

eziolotta commented on June 2, 2024

First experiment on segmenting short audio, using rVADfast plus an algorithm that analyzes the segments found by rVAD to generate a new sequence of speech segments.
rVAD (and some other VADs) tends to cut off the last bit of signal in a speech segment.
Code and other tests are yet to be published.
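
Since the actual code is not published yet, here is only a minimal sketch of what such a post-processing step could look like. It is not the algorithm used in the experiment. It assumes the VAD output is already available as a list of (start, end) times in seconds, and the names `refine_segments`, `tail_pad` and `max_gap` (and their default values) are hypothetical. It extends the end of each segment a little, to compensate for the clipped tail, and merges segments separated by a short pause.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_s, end_s), both in seconds


def refine_segments(
    segments: List[Segment],
    total_duration: float,
    tail_pad: float = 0.2,   # assumed value: extra audio kept after each segment
    max_gap: float = 0.3,    # assumed value: pauses up to this length are bridged
) -> List[Segment]:
    """Pad the end of each VAD segment and merge segments split by short gaps."""
    refined: List[Segment] = []
    for start, end in sorted(segments):
        # Extend the tail, but never past the end of the recording.
        end = min(end + tail_pad, total_duration)
        if refined and start - refined[-1][1] <= max_gap:
            # Close enough to the previous segment: merge the two.
            prev_start, prev_end = refined[-1]
            refined[-1] = (prev_start, max(prev_end, end))
        else:
            refined.append((start, end))
    return refined


if __name__ == "__main__":
    raw = [(0.0, 1.0), (1.1, 2.0), (5.0, 6.0)]
    print(refine_segments(raw, total_duration=6.1))
```

Both thresholds would need tuning per dataset: a larger `tail_pad` recovers more of the clipped ending but also lets in more silence.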

Input clip: 644_2532_000000.wav, 15 seconds (MLS dataset)
Output: 5 speech segments (WAV files)

test_segmentation_short_audio.zip

I will try to extend the algorithm to long audio (maybe an hour long; I will try public podcasts).


eziolotta commented on June 2, 2024

Continuing the experiments with rVADfast, I was able to segment one randomly chosen podcast from the Emilia-Romagna Region:

https://ambiente.regione.emilia-romagna.it/it/gallery/video/i-video-di-ermesambiente/convegno-inspire/stefano-olivucci-regione-emilia-romagna

This produced 143 segments, with durations ranging from a minimum of 2 seconds to a maximum of 2 minutes.
Execution time for this process was approximately 1.5 hours.
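
For anyone who wants to check the duration range of a set of segments, a small sketch using only the Python standard library could look like this. It assumes the segments are uncompressed PCM WAV files in a single folder; the function names and the folder layout are hypothetical and not part of rVADfast.

```python
import wave
from pathlib import Path
from typing import Dict, Optional


def wav_duration_seconds(path: Path) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def duration_summary(folder: str) -> Optional[Dict[str, float]]:
    """Summarize count, min, max and total duration of the *.wav files in a folder."""
    durations = [wav_duration_seconds(p) for p in sorted(Path(folder).glob("*.wav"))]
    if not durations:
        return None  # no WAV files found
    return {
        "count": len(durations),
        "min_s": min(durations),
        "max_s": max(durations),
        "total_s": sum(durations),
    }
```

Calling `duration_summary("segments")` on the folder of extracted clips would return the values to compare against the range reported above.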

The audio comes without transcriptions, so in this case automatic transcription followed by human validation must be applied.

Unfortunately, other speakers are also involved in the podcasts, and sometimes words are not clear, so a check is required during validation. There is no background noise in the podcasts and the audio is clean.

Other podcasts here
Licence: Creative Commons Attribution 4.0

The output dataset of my experiment can be downloaded here:
http://t.ly/xHHL

