Comments (18)
Hi @belzedaar,
Your issue is the main issue people face when trying to build an application with pocketsphinx (be it pocketsphinx or pocketsphinx.js). To my knowledge, there is no single way to do it, but your attempt is similar to what I would try. There is one thing you need to add, though, which is proper transition probabilities. Play with the filler probabilities to get decent rejection rates. To add probabilities, use the logp attribute in transitions. It is a log probability which defaults to 0 (i.e. probability = 1). For instance:
{from: 0, to: 0, logp: -5, word: "GARB_AA"},
{from: 0, to: 0, logp: -5, word: "GARB_AE"},
...
In any case, if you want to build a serious application, you should probably start with building a testing corpus that you can use to measure recognition performance (word error rate, false acceptance, rejection).
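As a rough illustration of measuring rejection performance on such a corpus, here is a minimal sketch. The function name `rejectionStats`, the shape of the test cases, and the sample results are all made up for this example; nothing here is part of pocketsphinx.js. It only counts false acceptances and false rejections; word error rate would need a separate alignment step.

```javascript
// Each test case: `expected` is the in-grammar phrase that was actually spoken,
// or null for noise / out-of-grammar speech. `hyp` is what the recognizer
// returned, or null when only filler words came back.
function rejectionStats(cases) {
  let inGrammar = 0;
  let falseRejects = 0;
  let outOfGrammar = 0;
  let falseAccepts = 0;
  for (const { expected, hyp } of cases) {
    if (expected === null) {
      outOfGrammar += 1;
      if (hyp !== null) falseAccepts += 1; // noise recognized as a grammar word
    } else {
      inGrammar += 1;
      if (hyp === null) falseRejects += 1; // grammar word caught as filler
    }
  }
  return {
    falseAcceptRate: outOfGrammar ? falseAccepts / outOfGrammar : 0,
    falseRejectRate: inGrammar ? falseRejects / inGrammar : 0,
  };
}

// Made-up results, for illustration only.
const sampleCases = [
  { expected: "HELLO", hyp: "HELLO" },
  { expected: "WORLD", hyp: null },
  { expected: null, hyp: "HELLO" },
  { expected: null, hyp: null },
];
const sampleStats = rejectionStats(sampleCases);
console.log(sampleStats);
```

Re-running the same corpus after each change to the filler probabilities gives a number to compare instead of an impression.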
Another piece of advice would be to go on IRC, on #cmusphinx. There are quite a lot of people there (including myself, as sylvainc) who are always helpful (not including myself).
I hope that helps,
Sylvain
from pocketsphinx.js.
+1
We will soon introduce a proper garbage word, but we are not there yet.
For spotting a single keyword, there is a keyword spotting search which should work pretty reliably. It's probably not exposed in pocketsphinx.js yet, but it could be exposed.
from pocketsphinx.js.
Hi @nshmyrev,
Thanks a lot, it is great to see all these additions to pocketsphinx. I'll update the sources and try to expose it to JavaScript. Keyword spotting (kws) should be particularly useful in web applications.
Sylvain
from pocketsphinx.js.
Thanks @syl22-00, I will try tweaking the probabilities. I did think about that
but I figured the garbage elements may well be more common than actual elements of my grammar,
so I wasn't sure how to bias the probabilities.
@nshmyrev - thanks for the info. I look forward to trying out the garbage word support
when it arrives.
While hunting around last night I found https://github.com/latentflip/hark/
which provides a simple callback system for knowing when there's at least enough
volume to consider doing recognition. Hooking that up to start and stop recognition events
weeds out very quiet noises generating false positives. I still haven't quantified exactly
how much this chops off the first part of the sound, but it should be possible
to keep the audio from before the event and feed it to the recognizer on start.
If I come up with something nice I'll submit it as a patch.
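The idea of keeping audio from before the event could be sketched roughly as below. This is only an assumption about how it might be done: `PreRollBuffer` is a hypothetical helper, not part of hark or pocketsphinx.js, and the chunk sizes are arbitrary.

```javascript
// Keeps roughly the last `maxSamples` audio samples, stored as whole chunks,
// so they can be replayed to the recognizer when speech is detected.
class PreRollBuffer {
  constructor(maxSamples) {
    this.maxSamples = maxSamples;
    this.chunks = [];
    this.length = 0; // total samples currently held
  }

  push(chunk) {
    this.chunks.push(chunk);
    this.length += chunk.length;
    // Drop old chunks from the front while the remainder still covers maxSamples.
    while (
      this.chunks.length > 1 &&
      this.length - this.chunks[0].length >= this.maxSamples
    ) {
      this.length -= this.chunks.shift().length;
    }
  }

  // Returns the buffered chunks, oldest first, and empties the buffer.
  drain() {
    const out = this.chunks;
    this.chunks = [];
    this.length = 0;
    return out;
  }
}

// Illustration with tiny chunks: only the most recent 4 samples are kept.
const preRoll = new PreRollBuffer(4);
preRoll.push(new Int16Array([1, 2]));
preRoll.push(new Int16Array([3, 4]));
preRoll.push(new Int16Array([5, 6]));
const replay = preRoll.drain();
console.log(replay.length, Array.from(replay[0]));
```

While nothing is being recognized, every incoming audio chunk would go into the buffer. When the volume callback signals the start of speech, the drained chunks would be fed to the recognizer first, followed by the live audio.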
Thanks
from pocketsphinx.js.
@belzedaar @nshmyrev , keyword spotting from PocketSphinx is now available/accessible in pocketsphinx.js!
from pocketsphinx.js.
Great! I'll give it a try this weekend.
Having had a look at the API, it seems you can only have one keyword search active at a time?
I guess you could have multiple recognizer objects fed with the same audio stream but that seems
a bit excessive.
from pocketsphinx.js.
You're right @belzedaar , there can be only one phrase to be spotted. If you want to spot a bunch of words and only need to know that any of them was spotted, you can just create one word with pronunciation alternatives with all words you want to spot.
In addition, you might want to adjust the threshold that determines sensitivity; I found that the default value (1) was a little strict.
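To make the "one word with pronunciation alternatives" idea concrete, here is a small sketch. The helper `spotWordEntries`, the word name `WAKE`, and the phone strings are assumptions for illustration. The `(2)`, `(3)` suffixes follow the usual CMU dictionary convention for alternate pronunciations; whether a given pocketsphinx.js build accepts entries in exactly this form should be checked against its documentation.

```javascript
// Build dictionary entries for one spotting word with several pronunciations.
// Each entry is a [word, phones] pair; alternates get a "(n)" suffix.
function spotWordEntries(name, pronunciations) {
  return pronunciations.map((phones, i) => [
    i === 0 ? name : `${name}(${i + 1})`,
    phones,
  ]);
}

// Hypothetical example: "WAKE" is spotted whether HELLO or GOODBYE is said.
const spotEntries = spotWordEntries("WAKE", ["HH AH L OW", "G UH D B AY"]);
console.log(spotEntries);
// [["WAKE", "HH AH L OW"], ["WAKE(2)", "G UH D B AY"]]
```

The trade-off is the one mentioned above: the recognizer reports only that `WAKE` was spotted, not which of the underlying words was said.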
from pocketsphinx.js.
Apologies in advance if the question is stupid, but instead of running many parallel keyword spotting instances, would it be feasible to use this approach?
- Have one instance that recognizes a keyword out of a pool of keywords (say 20)
- In parallel run a second instance using a grammar with just those keywords
- Count increments from the first instance would then be used to trigger extraction of the last keyword from the limited-grammar instance
from pocketsphinx.js.
@ignjat this is indeed tempting, but recognizer instantiations are actually very costly in memory. Meanwhile, it seems the sphinx folks have made a lot of improvements to keyword spotting, including the capability to add more than one keyword or keyphrase. I'll try to sync up with upstream as soon as possible.
from pocketsphinx.js.
Dear Colleagues,
Could you kindly guide me on how to solve this problem? Unfortunately, I couldn't find a way to implement a garbage model or a confidence score in pocketsphinx.
The problem is described here:
http://sourceforge.net/p/cmusphinx/discussion/help/thread/9a19df7a/
from pocketsphinx.js.
Is the ability to assign different handlers for different keywords in?
from pocketsphinx.js.
@ignjat I recently talked about it with the developers on #cmusphinx, and it is unfortunately still not available from the C API, only by loading a file from the command line. So right now it is not tested, but it might be easily doable if you are ready to spend a little bit of time figuring it out (it should be similar to packaging a custom dictionary or language model file and loading it with the -kws command line parameter).
Once it is available from the C API, I would add it in the JavaScript API.
from pocketsphinx.js.
@syl22-00 I have been trying to add the logp factor to the grammar in the demo given by PocketSphinx.js. The results seem to be the same: even if only noise is passed to the mic, it pops out words from the grammar.
I am pasting the changes I made to the cities grammar:
var grammarCities = {numStates: 1, start: 0, end: 0, transitions: [{from: 0, to: 0, logp: -5, word: "HELLO"},{from: 0, to: 0, logp: -5, word: "WORLD"}]};
I would be grateful for a way to reject words that are not in the grammar variable.
Even the Live Demo has the same issue: if you just blow air into the mic, it outputs Paris when the cities dropdown is selected.
Please let me know if there is functionality already available to do this.
Thanks in Advance.
from pocketsphinx.js.
@chand3040 the probability changes will not have any effect if you set the same probability for all transitions, since they will be normalized.
Say you have 3 transitions going out of state 0; by default each of them is equally probable (p=0.33). If you set them to 0.5, 0.25, and 0.25 (converted to log), the recognizer will be more likely to pick the first transition.
So this should be used after you add additional transitions with filler words that will be there to catch noise and words out of vocabulary/grammar.
Remember that the recognizer will always output a recognized string (hyp), no matter how well it matches, so your language model (here the grammar) must offer something that will be likely to be chosen in case of noise or OOV.
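The normalization point can be checked with a few lines of arithmetic. This is a toy model of "the outgoing probabilities of a state are normalized", not pocketsphinx code. The function name is hypothetical, and natural logarithms are an assumption: the thread does not state which log base or scaling the recognizer uses internally.

```javascript
// Convert log probabilities of the transitions leaving one state back to
// probabilities that sum to 1 (natural log assumed).
function normalizedProbabilities(logps) {
  const max = Math.max(...logps); // subtract the max for numerical stability
  const weights = logps.map((lp) => Math.exp(lp - max));
  const total = weights.reduce((a, b) => a + b, 0);
  return weights.map((w) => w / total);
}

// Same logp everywhere: after normalization it is as if nothing was set.
const equalProbs = normalizedProbabilities([-5, -5]);

// Different values: the first transition ends up favoured.
const biasedProbs = normalizedProbabilities(
  [0.5, 0.25, 0.25].map((p) => Math.log(p))
);
console.log(equalProbs, biasedProbs);
```

Setting `logp: -5` on both `HELLO` and `WORLD` therefore leaves them equally likely, exactly as if no `logp` had been given. Only the differences between the transitions leaving a state matter.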
from pocketsphinx.js.
@syl22-00 Thanks for quick reply.
Can you please explain with an example how to set up handling of OOV words? I was able to understand the logic of the log probabilities, though.
I would be grateful for an example.
Thanks again.
from pocketsphinx.js.
@syl22-00 One more quick question from my side: how can I get the confidence level of the result that gets detected?
Thanks in Advance.
from pocketsphinx.js.
@chand3040 for the garbage or filler words and transitions, I think everything is pretty much explained in this thread:
- Create filler words, similar to the GARB_... words described above.
- Add transitions to your grammar that contain these words. These transitions can be looping on one state, and/or in parallel with transitions from your original grammar.
- Set probabilities so that you don't have too many false positives (garbage being recognized as a word from the grammar) or false negatives (words from the grammar that are caught as fillers).
- Experiment with several strategies for adding these transitions and setting these probabilities so that you minimize the false acceptance (FA) and false rejection (FR) rates.
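A minimal sketch of these steps, using the grammar object shape that appears earlier in this thread. The helper `withFillers` is hypothetical, and the `logp` values are placeholders to be tuned against a test set, not recommended settings. The `GARB_AA` and `GARB_AE` filler words would also need dictionary entries, which are not shown here.

```javascript
// Return a copy of `grammar` with one self-looping filler transition per
// filler word on the start state, in parallel with the original transitions.
function withFillers(grammar, fillerWords, fillerLogp) {
  const fillers = fillerWords.map((word) => ({
    from: grammar.start,
    to: grammar.start,
    logp: fillerLogp,
    word: word,
  }));
  return {
    numStates: grammar.numStates,
    start: grammar.start,
    end: grammar.end,
    transitions: grammar.transitions.concat(fillers),
  };
}

// Single-state grammar with the two words used earlier in this thread.
const baseGrammar = {
  numStates: 1,
  start: 0,
  end: 0,
  transitions: [
    { from: 0, to: 0, logp: 0, word: "HELLO" },
    { from: 0, to: 0, logp: 0, word: "WORLD" },
  ],
};

// Fillers get a lower logp than the real words; -5 is only a starting point.
const grammarWithFillers = withFillers(baseGrammar, ["GARB_AA", "GARB_AE"], -5);
console.log(JSON.stringify(grammarWithFillers, null, 2));
```

With this layout, a hypothesis made only of `GARB_...` words can be treated by the application as a rejection. Raising the filler `logp` rejects more noise but also more real words, which is why the values have to be tuned.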
Unfortunately, right now there is no such thing as a reliable confidence score with grammars in pocketsphinx (otherwise you would probably not need to do these things with fillers). The sphinx developers are working on it, though, so you might want to follow the upstream cmusphinx projects and ask on IRC (#cmusphinx).
I hope this helps.
from pocketsphinx.js.
@syl22-00 I have gone through this thread, but since I am a newbie to speech recognition, your good ideas go over my head.
So I request you to help me with a simple, small example that recognizes the word _"Goodbye"_ with good accuracy.
I am eager to know which GARBAGE words you include, how you distribute the probabilities (logp), and how you set up the transitions.
It will be very beneficial for me and others who are new to speech recognition.
I now know how to compile pocketsphinx.js, so if changes to any source file could increase the accuracy, please suggest them.
Thank you in advance.
from pocketsphinx.js.