Comments (18)

syl22-00 commented on September 13, 2024

Hi @belzedaar,

Your issue is the one issue people face when trying to make an application from pocketsphinx (be it pocketsphinx or pocketsphinx.js). To my knowledge, there is not one way to do it, but your attempt is similar to what I would try. There is one thing you need to add, though, which is proper transition probabilities. Play with the filler probabilities to get decent rejection rates. To add probabilities, use the logp attribute in transitions. It is a log probability which defaults to 0 (i.e. probability = 1). For instance:

{from: 0, to: 0, logp: -5, word: "GARB_AA"},
{from: 0, to: 0, logp: -5, word: "GARB_AE"},
...

In any case, if you want to build a serious application, you should probably start with building a testing corpus that you can use to measure recognition performance (word error rate, false acceptance, rejection).
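
As a rough illustration of measuring performance on such a corpus, here is a minimal sketch. The function name, the shape of the result records, and the convention that `null` means "nothing in the grammar" are all assumptions made for this example; they are not part of pocketsphinx.js.

```javascript
// Each record pairs what was actually said with what the application accepted.
// `expected` is null when the audio holds no in-grammar word (noise or OOV).
// `recognized` is null when the application rejected the utterance.
function scoreCorpus(results) {
  let negatives = 0, positives = 0;
  let falseAccepts = 0, falseRejects = 0, substitutions = 0;
  for (const { expected, recognized } of results) {
    if (expected === null) {
      negatives++;
      if (recognized !== null) falseAccepts++;   // noise accepted as a word
    } else {
      positives++;
      if (recognized === null) falseRejects++;   // real word thrown away
      else if (recognized !== expected) substitutions++; // wrong word
    }
  }
  return {
    falseAcceptRate: negatives ? falseAccepts / negatives : 0,
    falseRejectRate: positives ? falseRejects / positives : 0,
    substitutionRate: positives ? substitutions / positives : 0,
  };
}

// Hypothetical hand-labelled results, for illustration only.
const sampleResults = [
  { expected: null, recognized: "PARIS" },
  { expected: null, recognized: null },
  { expected: "PARIS", recognized: "PARIS" },
  { expected: "PARIS", recognized: null },
];
const scores = scoreCorpus(sampleResults);
```

Re-running the same corpus after each change to the filler probabilities shows whether the change helped or merely traded false acceptances for false rejections.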

Another piece of advice would be to go on IRC, on #cmusphinx. There are quite a lot of people there (including myself, as sylvainc), always helpful (not including myself).

I hope that helps,

Sylvain

from pocketsphinx.js.

nshmyrev commented on September 13, 2024

+1

We will soon introduce a proper garbage word, but it is not there yet.

For spotting a single keyword, there is a keyword spotting search which should work pretty reliably. It's probably not exposed in pocketsphinx.js yet, but it could be.

syl22-00 commented on September 13, 2024

Hi @nshmyrev,

Thanks a lot, it is great to see all these additions to pocketsphinx. I'll update the sources and try to expose it to JavaScript. Keyword spotting (kws) should be particularly useful in web applications.

Sylvain

belzedaar commented on September 13, 2024

Thanks @syl22-00, I will try tweaking the probabilities. I did think about that
but I figured the garbage elements may well be more common than actual elements of my grammar,
so I wasn't sure how to bias the probabilities.

@nshmyrev - thanks for the info. I look forward to trying out the garbage word support
when it arrives.

While hunting around last night I found https://github.com/latentflip/hark/
which provides a simple callback system for knowing when there's at least enough
volume to consider doing recognition. Hooking that up to start and stop recognition events
weeds out very quiet noises generating false positives. I still haven't quantified exactly
how much this chops off the first part of the sound, but it should be possible
to keep the audio from before the event and feed it to the recognizer on start.
If I come up with something nice I'll submit it as a patch.
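
The idea of keeping the audio from before the event can be sketched as a small ring buffer. This is only an illustration: the class name `PreRollBuffer` and its methods are hypothetical, and the way chunks are handed to the recognizer is left out because it depends on how the audio pipeline is wired.

```javascript
// Keeps only the most recent `maxChunks` audio chunks, so that the audio
// just before a "speech started" event is still available.
class PreRollBuffer {
  constructor(maxChunks) {
    this.maxChunks = maxChunks;
    this.chunks = [];
  }

  // Call for every audio chunk that arrives while recognition is stopped.
  push(chunk) {
    this.chunks.push(chunk);
    if (this.chunks.length > this.maxChunks) {
      this.chunks.shift(); // drop the oldest chunk
    }
  }

  // Call when speech starts: returns the buffered chunks, oldest first,
  // and empties the buffer.
  drain() {
    const out = this.chunks;
    this.chunks = [];
    return out;
  }
}
```

With hark, the buffer would be drained in the handler for its `speaking` event and the returned chunks fed to the recognizer before the live audio. How many chunks to keep depends on the chunk size and on how much of the word onset the volume threshold cuts off, which still has to be measured.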

Thanks

syl22-00 commented on September 13, 2024

@belzedaar @nshmyrev , keyword spotting from PocketSphinx is now available/accessible in pocketsphinx.js!

belzedaar commented on September 13, 2024

Great! I'll give it a try this weekend.

Having a look at the API, it seems you can only have 1 keyword search active at a time?
I guess you could have multiple recognizer objects fed with the same audio stream but that seems
a bit excessive.

syl22-00 commented on September 13, 2024

You're right @belzedaar, there can be only one phrase to be spotted. If you want to spot a bunch of words and only need to know that any of them was spotted, you can just create one word whose pronunciation alternatives cover all the words you want to spot.
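
A minimal sketch of building such a word follows. It assumes the CMU-dictionary convention of naming alternatives WORD, WORD(2), WORD(3), and so on, and it assumes words are given as [word, pronunciation] pairs; the helper name, the word "COMMAND", and the phone strings are illustrative only.

```javascript
// Build dictionary entries that all map to a single spotted "word".
// Assumption: alternatives follow the WORD, WORD(2), WORD(3), ... naming
// convention, and each entry is a [word, pronunciation] pair.
function buildAlternatives(word, pronunciations) {
  return pronunciations.map((pron, i) => [
    i === 0 ? word : `${word}(${i + 1})`,
    pron,
  ]);
}

// One pseudo-word "COMMAND" that matches either "hello" or "goodbye".
const entries = buildAlternatives("COMMAND", ["HH AH L OW", "G UH D B AY"]);
```

The resulting pairs would then be added to the recognizer's dictionary, and "COMMAND" used as the phrase to spot. The recognizer only reports that "COMMAND" was heard, not which alternative matched.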

In addition, you might want to adjust the threshold that determines sensitivity. I found that the default value (1) was a little strict.

ignjat commented on September 13, 2024

Apologies in advance if the question is stupid, but would it be feasible to use this approach instead of running many parallel keyword spotting instances?

  1. Have one instance that recognizes a keyword out of a pool of keywords (say 20)
  2. In parallel run a second instance using a grammar with just those keywords
  3. Count increments from first instance would then be used to trigger extraction of the last keyword out of the limited grammar instance

syl22-00 commented on September 13, 2024

@ignjat this is indeed tempting, but recognizer instantiations are actually very costly in memory. Meanwhile, it seems the sphinx folks have made a lot of improvements to keyword spotting, including the capability to add more than one keyword or keyphrase. I'll try to sync up with upstream as soon as possible.

bsnayak commented on September 13, 2024

Dear Colleagues,

Can you kindly guide me on how to solve this issue? Unfortunately, I couldn't find a way to implement a garbage model or a confidence score in pocketsphinx.

The problem is here

http://sourceforge.net/p/cmusphinx/discussion/help/thread/9a19df7a/

ignjat commented on September 13, 2024

Is the ability to assign different handlers for different keywords in?

syl22-00 commented on September 13, 2024

@ignjat I recently talked about it with the developers on #cmusphinx, and it is unfortunately still not available from the C API, only by loading a file from the command line. So right now it is not tested, but it might be easily doable if you are ready to spend a little time figuring it out (it should be similar to packaging a custom dictionary or language model file and loading it with the -kws command line parameter).

Once it is available from the C API, I would add it in the JavaScript API.

chand3040 commented on September 13, 2024

@syl22-00 I have been trying to add the logp factor to the grammar in the demo given by PocketSphinx.js. The results seem to be the same. Even if only noise is passed to the mic, it pops out words from the grammar.
I am pasting the changes I made in the cities grammar:
var grammarCities = {numStates: 1, start: 0, end: 0, transitions: [{from: 0, to: 0, logp: -5, word: "HELLO"},{from: 0, to: 0, logp: -5, word: "WORLD"}]};

I would be grateful for a way to reject words that are not in the grammar variable.
Even the Live Demo has the same issue. If you just blow air into the mic in the Live Demo, it results in Paris when the cities dropdown is selected.
Please let me know if there is functionality already available to do this.
Thanks in advance.

syl22-00 commented on September 13, 2024

@chand3040 the probability changes will not have any effect if you set the same probability for all transitions, since they will be normalized.

Say you have 3 transitions going out of state 0. By default, each of them is equally probable (p=0.33). If you set them to 0.5, 0.25, and 0.25 (then converted to log probabilities), the recognizer will be more likely to pick the first transition.
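
For illustration, the conversion to logs can be sketched as below. Note the assumption: this thread does not state which log base or scaling the logp attribute expects, so the base is left as a parameter (natural log by default), and the helper name `toLogp` is made up for this example.

```javascript
// Convert a probability in (0, 1] to a log probability.
// Assumption: the log base expected by the recognizer is not stated in this
// thread, so it is a parameter here and defaults to the natural log.
function toLogp(probability, base = Math.E) {
  if (!(probability > 0 && probability <= 1)) {
    throw new RangeError("probability must be in (0, 1]");
  }
  return Math.log(probability) / Math.log(base);
}

// The three transitions from the example above: 0.5, 0.25 and 0.25.
const logps = [0.5, 0.25, 0.25].map((p) => toLogp(p));
```

Whatever the base, a probability of 1 maps to 0, which matches the default logp of 0, and smaller probabilities map to more negative values.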

So this should be used after you add additional transitions with filler words that will be there to catch noise and words out of vocabulary/grammar.

Remember that the recognizer will always output a recognized string (hyp), no matter how well it matches, so your language model (here the grammar) must offer something that will be likely to be chosen in case of noise or OOV.
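
Since the hypothesis will then contain filler words whenever noise or OOV speech is heard, the application has to drop them before acting on the result. A minimal sketch, assuming the fillers share the GARB_ prefix used earlier in this thread (the function name is hypothetical):

```javascript
// Remove filler words from a hypothesis string, keeping in-grammar words.
// Assumption: every filler word starts with the same prefix ("GARB_").
function stripFillers(hyp, fillerPrefix = "GARB_") {
  return hyp
    .split(/\s+/)
    .filter((word) => word !== "" && !word.startsWith(fillerPrefix))
    .join(" ");
}

const cleaned = stripFillers("GARB_AA HELLO GARB_AE");
```

An empty string after stripping means only fillers were recognized, which the application can treat as a rejection.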

chand3040 commented on September 13, 2024

@syl22-00 Thanks for the quick reply.
I was able to understand the logic of the log probabilities, but can you please explain how to handle OOV?
I would be grateful for an example.
Thanks again.

chand3040 commented on September 13, 2024

@syl22-00 One more quick question from my side. How can I get the confidence level of the result that gets detected?
Thanks in advance.

syl22-00 commented on September 13, 2024

@chand3040 for the garbage or filler words and transitions, I think everything is pretty much explained in this thread:

  • Create filler words, similar to the GARB_... described above.
  • Add transitions to your grammar that contain these words. These transitions can be looping on one state, and/or in parallel with transitions from your original grammar.
  • Set probabilities so that you don't have too many false positives (garbage being recognized as a word from the grammar) or false negatives (words from the grammar that are caught as fillers).
  • Experiment with several strategies for adding these transitions and setting these probabilities so that you minimize the false acceptance (FA) and false rejection (FR) rates.
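
The steps above can be sketched as follows, using the same grammar object format as the snippets earlier in this thread. The helper name is hypothetical, and the logp values are placeholders to be tuned against a test corpus, not recommended settings. The filler words are also assumed to already be in the recognizer's dictionary.

```javascript
// Build a one-state grammar where keywords and filler words all loop on
// state 0, with separate log probabilities for the two groups.
// The logp values passed in are placeholders meant to be tuned.
function buildGrammarWithFillers(keywords, fillers, keywordLogp, fillerLogp) {
  const loop = (word, logp) => ({ from: 0, to: 0, logp: logp, word: word });
  return {
    numStates: 1,
    start: 0,
    end: 0,
    transitions: [
      ...keywords.map((word) => loop(word, keywordLogp)),
      ...fillers.map((word) => loop(word, fillerLogp)),
    ],
  };
}

const grammarWithFillers = buildGrammarWithFillers(
  ["HELLO", "WORLD"],
  ["GARB_AA", "GARB_AE"],
  -1, // placeholder logp for in-grammar words
  -5  // placeholder logp for filler words
);
```

Making the fillers less likely than the keywords biases the recognizer toward the grammar; making them more likely increases rejection. Which direction is needed depends on the measured FA and FR rates.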

Unfortunately, right now there is no such thing as a reliable confidence score with grammars in pocketsphinx (otherwise you would probably not need to do these things with fillers). The sphinx developers are working on it, though, so you might want to follow the upstream cmusphinx projects and ask on IRC (#cmusphinx).

I hope these help.

naresh21 commented on September 13, 2024

@syl22-00 I have gone through this thread, but since I am a newbie to speech recognition, your good ideas go over my head.
So I would like to ask you for a small, simple example that recognizes the word _"Goodbye"_ with good accuracy.
I am eager to know which GARBAGE words you include, how you distribute the probabilities (logp), and how you set up the transitions.
It would be very beneficial for me and others who are new to speech recognition.

I now know how to compile and build pocketsphinx.js, so if changes to any source file could increase the accuracy, please suggest them.
Thank you in advance.
