libfvad's People

Contributors

afedor, agouaillard, arlolra, dpirch, ehlemur-zz, eugenebas, fippo, floppym, frogg, ggarber, hekra01, jonnor, korniltsev, maojie, max-potapov, meetakshay99, mirkobonadei, mstyura, mzanaty, niklasenbom, oprypin, pasko, philipel-webrtc, pkasting, pthatcherg, s3rj1k, sdkdimon, sorenskak, steweg, tkchin

libfvad's Issues

Error when using sndfile

Hi,
Thanks for your work.
When I tried to compile the code, this error occurred:
./configure: line 12008: syntax error near unexpected token `sndfile,'
./configure: line 12008: `PKG_CHECK_MODULES(sndfile, sndfile)'

I have already installed libsndfile1-dev, but the error still occurs.
It also prevents the example from being compiled.
Can you tell me how to fix this?

Regards,
Robin

Does not take into account bit depth and channel numbers

To measure how many bytes are in x milliseconds of an audio sample, you must take into account the bit depth and the number of channels. For example, to find the number of bytes in x milliseconds, this is what I do:

#include <cstdint>

// Bytes of interleaved PCM audio per second for the given format.
long bytes_per_second(int sample_rate, int8_t bit_depth, int8_t channels) {
    auto byte_depth = bit_depth / 8;

    return static_cast<long>(sample_rate) * channels * byte_depth;
}

// sample_rate, bit_depth and channels describe the input audio.
auto bps = bytes_per_second(sample_rate, bit_depth, channels);
auto bytes_per_millisecond = bps / 1000;
auto bytes_per_chunk = bytes_per_millisecond * chunk_ms; // chunk_ms is 10, 20 or 30 milliseconds

For 10 milliseconds of audio at an 8000 Hz sample rate, this could be 80 bytes (8-bit mono), 160 bytes (16-bit mono), or 160 bytes (8-bit, 2 channels), and so on. Currently fvad_process only accepts 80 bytes for 8000 Hz and 10 milliseconds. Does the WebRTC VAD impose these limitations?
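The frame-size arithmetic above can be sketched in C as follows. The helper names are hypothetical and not part of libfvad. One point that may resolve part of the confusion: fvad_process takes a pointer to int16_t samples and a length counted in samples, not bytes, so 10 ms at 8000 Hz is a length of 80, which occupies 160 bytes of 16-bit mono audio.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration; not part of libfvad. */

/* Number of samples per channel in a frame of frame_ms milliseconds. */
size_t samples_per_frame(int sample_rate, int frame_ms) {
    assert(sample_rate > 0 && frame_ms > 0);
    return (size_t)sample_rate * (size_t)frame_ms / 1000;
}

/* Number of bytes that frame occupies in an interleaved PCM buffer. */
size_t bytes_per_frame(int sample_rate, int frame_ms, int bit_depth, int channels) {
    assert(bit_depth > 0 && bit_depth % 8 == 0 && channels > 0);
    return samples_per_frame(sample_rate, frame_ms)
         * (size_t)(bit_depth / 8) * (size_t)channels;
}
```

Multiplying before dividing by 1000 avoids the rounding loss of computing bytes per millisecond first, which matters for rates that are not a multiple of 1000.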

Also, would it hurt the WebRTC VAD's accuracy if I gave it frames between the 10, 20 and 30 millisecond sizes, such as 11 or 24 milliseconds' worth of bytes? This is a problem for me because I end up with leftover bytes that don't fit neatly into 10, 20 or 30 milliseconds.

So I was hoping I could redistribute the bytes so that some chunks include 1 more byte, which would use up the remainder. For example, 18 bytes / 4 chunk_size = 4 chunks with a remainder of 2 bytes; a solution could be to use 5 bytes per chunk for the first 10 bytes and then 4-byte chunks for the remaining 8 bytes. I hope this makes sense.
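Since fvad_process is documented to accept only 10, 20 or 30 ms frames and to return -1 for any other length, uneven chunk sizes are unlikely to work. One possible alternative, shown here only as a sketch and not as the library's recommended approach, is to keep the frame length fixed and zero-pad the final short frame. The helper name get_padded_frame is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration; not part of libfvad.
 * Copies frame number `index` out of `samples` (which holds `count`
 * samples) into `frame`, which must have room for `frame_len` samples.
 * A short final frame is padded with zeros (silence).
 * Returns the number of real samples copied, or 0 if `index` is past the end. */
size_t get_padded_frame(const int16_t *samples, size_t count,
                        size_t index, int16_t *frame, size_t frame_len) {
    assert(frame_len > 0);
    size_t start = index * frame_len;
    if (start >= count)
        return 0;
    size_t n = count - start;
    if (n > frame_len)
        n = frame_len;
    memcpy(frame, samples + start, n * sizeof(int16_t));
    memset(frame + n, 0, (frame_len - n) * sizeof(int16_t));
    return n;
}
```

With the 18-sample, chunk-size-4 example above, frames 0 to 3 are full and frame 4 holds the 2 leftover samples followed by 2 zeros. When audio arrives as a stream, carrying the leftover samples over to the front of the next buffer is another option that avoids padding.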

How to use the VAD?

Hello,

After building libfvad, I want to use the VAD to cut speech out of a WAV or raw file.
How do I do that?

Thanks

C++ only works when installed from repo not from release

Hey there!

Trying to get this lib to work with C++ was giving me headaches until I discovered this pull request (Add support against linkage with C++ programs) and your comment that you added the support with this commit.

I downloaded the lib from the release page, which is quite outdated, and this fix was not included in that release. Cloning and building from the repository solved my problem, and the header then included the code I needed.

So please either advise people to build directly from master or make a new release.

Many thanks :)

How do I use it in UniMRCP?

I am trying to rewrite the UniMRCP mpf_activity_detector and use libfvad to replace its VAD module,
but I get errors like:
../../platforms/libunimrcp-client/.libs/libunimrcpclient.so: undefined reference to `fvad_new'
../../platforms/libunimrcp-client/.libs/libunimrcpclient.so: undefined reference to `fvad_reset'
../../platforms/libunimrcp-client/.libs/libunimrcpclient.so: undefined reference to `fvad_process'

modules/audio_processing/vad porting

Thanks for the good project!

I'm new to VAD. It seems that the folder modules/audio_processing/vad also contains some VAD-related code. Is that right? If so, are you planning to port that part too?

I'm also wondering whether you plan to write some Python binding examples, so that people can use this library more easily. :-)

"-std=c11" is required

While README.md states:

Recommended CFLAGS to turn on warnings: -std=c11 -Wall -Wextra -Wpedantic

It is actually required to use "-std=c11" for the compiled library to work, otherwise problems will arise either when compiling or importing. Maybe the flag should be added to the makefiles.

Tested with GCC 4.8.5: if no "-std" flags are specified, compilation fails with

fvad.c:75:5: error: ‘for’ loop initial declarations are only allowed in C99 mode
     for (size_t i = 0; i < arraysize(valid_rates); i++) {
     ^
fvad.c:75:5: note: use option -std=c99 or -std=gnu99 to compile your code

and if "-std=c99" is specified with warnings turned on, many warnings like this one appear:

In file included from signal_processing/signal_processing_library.h:34:0,
                 from signal_processing/get_scaling_square.c:18:
signal_processing/spl_inl.h: In function ‘WebRtcSpl_CountLeadingZeros32’:
signal_processing/spl_inl.h:42:3: warning: implicit declaration of function ‘static_assert’ [-Wimplicit-function-declaration]
   RTC_COMPILE_ASSERT(sizeof(unsigned int) == sizeof(uint32_t));
   ^

Using the compiled library then raises an error similar to "undefined symbol: static_assert".

Issue building libfvad on Ubuntu 22.04.1 LTS

I am unable to install libfvad on Ubuntu 22.04.1 LTS. When running sudo autoreconf -i, I get the error below:

libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, 'ac-aux'.
libtoolize: copying file 'ac-aux/ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIRS, 'm4'.
libtoolize: copying file 'm4/libtool.m4'
libtoolize: copying file 'm4/ltoptions.m4'
libtoolize: copying file 'm4/ltsugar.m4'
libtoolize: copying file 'm4/ltversion.m4'
libtoolize: copying file 'm4/lt~obsolete.m4'
configure.ac:16: error: possibly undefined macro: _AC_C_STD_TRY
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
configure.ac:22: error: possibly undefined macro: AC_MSG_ERROR
autoreconf: error: /usr/bin/autoconf failed with exit status: 1

Please help me fix this issue.

Can't rebuild examples/fvadwav.c

After building and installing I try:

cp examples/fvadwav.c ~/tmp_proj/
g++ -v -g fvadwav.c

and get an error:

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/8/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 8.3.0-6ubuntu1~18.04.1' --with-bugurl=file:///usr/share/doc/gcc-8/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-8 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 8.3.0 (Ubuntu 8.3.0-6ubuntu1~18.04.1) 
COLLECT_GCC_OPTIONS='-v' '-g' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-linux-gnu/8/cc1plus -quiet -v -imultiarch x86_64-linux-gnu -D_GNU_SOURCE fvadwav.c -quiet -dumpbase fvadwav.c -mtune=generic -march=x86-64 -auxbase fvadwav -g -version -fstack-protector-strong -Wformat -Wformat-security -o /tmp/ccxmJrHK.s
GNU C++14 (Ubuntu 8.3.0-6ubuntu1~18.04.1) version 8.3.0 (x86_64-linux-gnu)
	compiled by GNU C version 8.3.0, GMP version 6.1.2, MPFR version 4.0.1, MPC version 1.1.0, isl version isl-0.19-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring duplicate directory "/usr/include/x86_64-linux-gnu/c++/8"
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/8/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/c++/8
 /usr/include/x86_64-linux-gnu/c++/8
 /usr/include/c++/8/backward
 /usr/lib/gcc/x86_64-linux-gnu/8/include
 /usr/local/include
 /usr/lib/gcc/x86_64-linux-gnu/8/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C++14 (Ubuntu 8.3.0-6ubuntu1~18.04.1) version 8.3.0 (x86_64-linux-gnu)
	compiled by GNU C version 8.3.0, GMP version 6.1.2, MPFR version 4.0.1, MPC version 1.1.0, isl version isl-0.19-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 27ae9a20c27efba91196488dcf7713bb
COLLECT_GCC_OPTIONS='-v' '-g' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
 as -v --64 -o /tmp/ccNkTGt2.o /tmp/ccxmJrHK.s
GNU assembler version 2.30 (x86_64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.30
COMPILER_PATH=/usr/lib/gcc/x86_64-linux-gnu/8/:/usr/lib/gcc/x86_64-linux-gnu/8/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/8/:/usr/lib/gcc/x86_64-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/x86_64-linux-gnu/8/:/usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/8/../../../../lib/:/lib/x86_64-linux-gnu/:/lib/../lib/:/usr/lib/x86_64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/8/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-g' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-linux-gnu/8/collect2 -plugin /usr/lib/gcc/x86_64-linux-gnu/8/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/8/lto-wrapper -plugin-opt=-fresolution=/tmp/cc5K4mhk.res -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie -z now -z relro /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/Scrt1.o /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/8/crtbeginS.o -L/usr/lib/gcc/x86_64-linux-gnu/8 -L/usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/8/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/8/../../.. /tmp/ccNkTGt2.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-linux-gnu/8/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/crtn.o
/tmp/ccNkTGt2.o: In function `process_sf(SNDFILE_tag*, Fvad*, unsigned long, SNDFILE_tag**, _IO_FILE*)':
/home/t4nner/proj/learning/vad/fvadwav.c:38: undefined reference to `sf_read_double'
/home/t4nner/proj/learning/vad/fvadwav.c:44: undefined reference to `fvad_process'
/home/t4nner/proj/learning/vad/fvadwav.c:57: undefined reference to `sf_write_double'
/tmp/ccNkTGt2.o: In function `main':
/home/t4nner/proj/learning/vad/fvadwav.c:114: undefined reference to `fvad_new'
/home/t4nner/proj/learning/vad/fvadwav.c:126: undefined reference to `fvad_set_mode'
/home/t4nner/proj/learning/vad/fvadwav.c:179: undefined reference to `sf_open'
/home/t4nner/proj/learning/vad/fvadwav.c:181: undefined reference to `sf_strerror'
/home/t4nner/proj/learning/vad/fvadwav.c:190: undefined reference to `fvad_set_sample_rate'
/home/t4nner/proj/learning/vad/fvadwav.c:205: undefined reference to `sf_open'
/home/t4nner/proj/learning/vad/fvadwav.c:207: undefined reference to `sf_strerror'
/home/t4nner/proj/learning/vad/fvadwav.c:242: undefined reference to `sf_close'
/home/t4nner/proj/learning/vad/fvadwav.c:244: undefined reference to `sf_close'
/home/t4nner/proj/learning/vad/fvadwav.c:246: undefined reference to `fvad_free'
collect2: error: ld returned 1 exit status

My /usr/include is:

➜  vad ll /usr/include | grep sndfile  
-rw-r--r--   1 root root  29K Jun  8  2019 sndfile.h
-rw-r--r--   1 root root  13K Jun  8  2019 sndfile.hh

/usr/local/include:

➜  vad ll -t /usr/local/include | head -n 2
-rw-r--r-- 1 root root 2.6K Jan 28 15:01 fvad.h

How can I reproduce your example code without errors?

Update

I found a solution:

g++ -v -g fvadwav.c -lsndfile -lfvad

Please add it to the description.

Cut audio into chunks and extract start time and end time

Hi,
I want to run libfvad on my own audio file and save the audio chunks detected as voiced frames in ".wav" format. I also want to extract the start time and end time of each chunk.

Currently I am able to reproduce the output in "libfvad/tests/data/wavtest.expect", so I can see that it detects voiced and unvoiced frames in the audio file.

Thanks
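Once a per-frame decision is available, start and end times can be derived by grouping consecutive voiced frames. The sketch below assumes the decisions have already been collected into an array (for example, the return value of fvad_process for each frame, where 1 means voiced) and that all frames have the same duration. The Segment type and find_segments function are hypothetical names, not part of libfvad.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type and helper for illustration; not part of libfvad. */
typedef struct {
    double start_s; /* segment start in seconds */
    double end_s;   /* segment end in seconds */
} Segment;

/* Groups runs of voiced frames (decisions[i] == 1) into segments.
 * Every frame is assumed to last frame_ms milliseconds.
 * Writes at most max_segs segments to `out` and returns how many were written. */
size_t find_segments(const int *decisions, size_t n_frames, int frame_ms,
                     Segment *out, size_t max_segs) {
    assert(frame_ms > 0);
    size_t n_segs = 0;
    size_t i = 0;
    while (i < n_frames && n_segs < max_segs) {
        if (decisions[i] != 1) {
            i++;
            continue;
        }
        size_t first = i;
        while (i < n_frames && decisions[i] == 1)
            i++;
        /* The run covers frames first .. i-1, so it ends where frame i begins. */
        out[n_segs].start_s = (double)first * frame_ms / 1000.0;
        out[n_segs].end_s = (double)i * frame_ms / 1000.0;
        n_segs++;
    }
    return n_segs;
}
```

Multiplying the start and end times by the sample rate gives the sample range to copy into each output ".wav" file. In practice, short gaps between segments are often merged so that brief pauses do not split a sentence; that refinement is left out here.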

How to build?

Hi,
After I git clone the project and cd into libfvad/, I run
./configure
but I get
-bash: ./configure: No such file or directory
Of course, I have already run
sudo apt install autoconf libtool pkg-config

So what is wrong?
How can I solve the problem?

I'm using unsigned char

My audio data is in the form of unsigned char* arrays. fvad_process takes a signed short. Do I need to just convert from char to short? Will there be a loss of correctness as far as the vad is concerned?
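The right conversion depends on what the unsigned char array actually holds, which the question does not say. The sketch below covers two common cases under stated assumptions: raw bytes of 16-bit signed little-endian PCM, and 8-bit unsigned PCM where 128 is silence. Both helper names are hypothetical and not part of libfvad.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of libfvad. */

/* Case 1 (assumed): the bytes are 16-bit signed little-endian PCM,
 * so every two bytes form one sample. */
void s16le_bytes_to_int16(const unsigned char *bytes, size_t n_samples, int16_t *out) {
    assert(bytes != NULL && out != NULL);
    for (size_t i = 0; i < n_samples; i++) {
        int u = bytes[2 * i] | (bytes[2 * i + 1] << 8); /* 0 .. 65535 */
        out[i] = (int16_t)(u >= 0x8000 ? u - 0x10000 : u);
    }
}

/* Case 2 (assumed): the bytes are 8-bit unsigned PCM with 128 as silence.
 * Each byte is re-centred on zero and scaled to the 16-bit range. */
void u8_to_int16(const unsigned char *in, size_t n, int16_t *out) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(((int)in[i] - 128) * 256);
}
```

In the first case nothing is lost, because the bytes are only reassembled into samples. In the second case the widening itself is lossless, but the audio still has only 8 bits of resolution, and whether that affects detection quality would need to be checked on real recordings.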

Quality benchmarks between audiotok / webrtcvad / silero-vad

Instruments

We have compared 3 easy-to-use off-the-shelf instruments for voice activity / audio activity detection: audiotok, webrtcvad, and silero-vad.

Caveats

  • Full disclaimer - we are mostly interested in voice detection, not just silence detection;
  • In our extensive experiments we noticed that WebRTC is actually much better at detecting silence than at detecting speech (probably by design). It has a lot of false positives when detecting speech;
  • audiotok provides Audio Activity Detection, which in layman's terms probably just means detecting silence;
  • silero-vad is geared towards speech detection (as opposed to noise or music);
  • A sensible chunk size for our VAD is at least 75-100 ms (pauses in speech shorter than 100 ms are not very meaningful, but we prefer 150-250 ms chunks, see quality comparison here), while audiotok and webrtcvad use 30-50 ms chunks (we used default values of 30 ms for webrtcvad and 50 ms for audiotok);
  • We have excluded pyannote-audio for now (https://github.com/pyannote/pyannote-audio), since it features models pre-trained only on limited academic datasets and is mostly a recipe collection / toolkit for building your own tools, not a finished tool per se (also, for such a simple task the amount of code bloat is puzzling from a production standpoint; our internal VAD training code is literally just 5 Python modules);

Methodology

Please refer here - https://github.com/snakers4/silero-vad#vad-quality-metrics-methodology

Quality Benchmarks

Finished tests:

[image: quality benchmark results]

Portability and Speed

  • It looks like webrtcvad was originally written in C++ around 2016, so in theory it can be ported to many platforms;
  • I have asked in the community: the original VAD seems to have matured, and the Python version is based on a 2018 version;
  • It looks like audiotok is written in plain Python, but I guess the algorithm itself can be ported;
  • silero-vad is based on PyTorch and ONNX, so it has the same portability options that both of these frameworks offer (mobile, different backends for ONNX, Java and C++ inference APIs, graph conversion from ONNX);

This is by no means extensive or complete research on the topic; please point out anything that is lacking.

MinGW build

I'm trying to build libfvad in MS Windows (i.e. using MinGW) and was getting strange errors, but I think I have them resolved. If I get it to run, I'll issue a pull request.

Information request

Hi,
I'm new to the libfvad library.
I am wondering about using this library on an embedded 32-bit microcontroller based on the Cortex-M4 (or on the ESP32), but I cannot find any information about its memory requirements or the CPU power it needs.
Has anyone had experience with this?
Thank you.
Regards.
