I am looking for some much needed help and advice in regards to speech recognition, and I'm hoping you can point me in the right direction.
I'm currently waiting for my EZ robot kit to arrive, so in the meantime I have been familiarising myself with the EZ builder software and have been playing around with the speech recognition and Pandorabots options. EZ builder is currently installed on a Windows 7 64 bit laptop and I'm currently using a headset microphone, not a great quality one I admit, but it does work.
So after training the Windows speech recognition program a few times, and adjusting the microphone settings, I have been trying to use it with MS Word, Notepad, EZ builder with my pandorabot and the pandorabots website on its own, and I have come to the conclusion that Windows speech recognition is, well, "pants" to put it politely (unless I'm missing something). In a quiet room with no background noise, and speaking with a clear English accent, it only seems to pick up 40 to 50% of what I'm saying correctly, compared to the 95% I'm getting on my iPhone, which is getting really frustrating now.
So I would like to ask...
1.) Is there a better way of training Windows speech recognition?
2.) Can anyone suggest a decent well priced microphone to use?
3.) Is there any better speech recognition software I could use which will work with EZ builder?
4.) And finally, what sort of set up do you guys use that works well with your projects that use speech rec?
I really need to nail this, as speech recognition will play a big part of interacting with my robot when he is finished. So any help, thoughts, or suggestions you guys can offer really would be appreciated, and I thank you in advance.
What I mean by "open dictionary" is using the entire English dictionary, whereas ARC's Speech Recognition Control only detects phrases that you have defined.
An open dictionary has a kabillion words, and therefore increases the chance of false positives by a kabillion times.
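To illustrate the idea, here is a rough Python sketch of why a small set of defined phrases is more forgiving than an open dictionary. It is not how ARC or Windows SAPI work internally; it only uses Python's standard `difflib` to snap whatever was "heard" onto the nearest defined phrase. The phrase list, the `match_phrase` name and the 0.6 cutoff are all made up for the example.

```python
import difflib

# Hypothetical phrase list, standing in for the phrases you would
# define yourself in a speech recognition control.
DEFINED_PHRASES = [
    "tell me a joke",
    "robot forward",
    "robot stop",
    "what time is it",
]


def match_phrase(heard, phrases=DEFINED_PHRASES, cutoff=0.6):
    """Return the closest defined phrase, or None if nothing is close enough.

    The cutoff (0.0 to 1.0) is an assumed similarity threshold: higher
    values reject more input, lower values accept sloppier matches.
    """
    matches = difflib.get_close_matches(
        heard.lower().strip(), phrases, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


if __name__ == "__main__":
    # A misheard word still lands on the intended phrase, because there
    # are only a handful of candidates to choose from.
    print(match_phrase("tell me a john"))
    # Input that is not close to any defined phrase is rejected.
    print(match_phrase("zzzz"))
```

With only four candidates, a misheard "john" still lands on "tell me a joke". With a whole dictionary of candidates, "john" is a perfectly valid word in its own right, so nothing pulls it back to what you actually said.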
Use a headset
This is the link that explains how to use the PandoraBot control: http://www.ez-robot.com/Tutorials/Help.aspx?id=189
It also explains how important it is to use a headset
Yes I am using a headset at the moment as I am aware that using one is the best option for voice rec. I have played with numerous microphone settings but it just does not seem to make much difference.
As I mentioned before, it is not the best quality headset in the world but I should be getting better results than what I'm getting now. I just wondered if there is a better way to train the v/rec apart from reading the standard set up phrases in the control panel over and over? For example, one word it won't understand is "joke". It seems to think I'm saying "john" or "job" and I keep correcting it, but after a lot of corrections it still won't understand the joke (Maybe once out of about 30 attempts it will get it).
I knew controlling the EZ-B was possible through Pandora after reading what DJ wrote about embedding EZ script into the Pandora response code (as DJ linked above). I just wondered if someone could suggest a good quality mic, better v/rec software or a better v/rec training process? Apparently the more Windows v/rec is used the better it learns, but I've not seen evidence of this just yet, and it's getting a little frustrating.
As I mentioned in my first post, using my iPhone s/rec the results are about 95% accurate, so when DJ and his crew release the iOS side of things, that is what I will probably end up using, as I have had great success using iOS speech recognition and a chatbot app on another project I did a while back. I can hold the iPhone at arm's length and it is still over 90% accurate, and I have REALLY long arms.
But for now it's Windows s/rec I need to use for my project. What do you guys use on your projects that use speech recognition which you have had success with?
Train it more
Use a better mic
I've been training Jarvis for years now. Once a day, three or four times a week, run the training. I believe the more you use the Windows SAPI the better it gets too. I read somewhere that it is always "training" even when just listening, though how true that is I don't know.
I get 80% positive results when using the built in mic on my webcam 3 meters from where I am. I get 95% positive results on a bluetooth headset.
Here are some more tips
It is true though, Windows SAPI is poor when compared to iOS, Google or DNS.
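If you want to put your own numbers on results like the 80% and 95% above, and see whether training is actually helping over time, something along these lines might do. It is only a sketch: the `recognition_accuracy` name and the sample attempts are invented for illustration, and it simply compares what you said with what the recogniser typed back, ignoring case and stray spaces.

```python
def recognition_accuracy(attempts):
    """Return the percentage of attempts recognised correctly.

    `attempts` is a list of (spoken, recognised) string pairs, for
    example noted down by hand while dictating into Notepad.
    """
    if not attempts:
        return 0.0
    correct = sum(
        1
        for spoken, recognised in attempts
        if spoken.strip().lower() == recognised.strip().lower()
    )
    return 100.0 * correct / len(attempts)


if __name__ == "__main__":
    # Invented sample data: the same word said four times.
    session = [
        ("joke", "john"),
        ("joke", "job"),
        ("joke", "joke"),
        ("joke", "Joke "),
    ]
    print(f"{recognition_accuracy(session):.0f}% correct")
```

Running the same list of test phrases before and after a week of training gives you a like-for-like comparison, rather than going on a general feeling that it is better or worse.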
When you say about training, do you mean going through the "Windows voice training" repeatedly over time, or actually correcting the words using the "Correct (word that is wrong)" command until it gets it?
I have looked through the Nuance website, reading about the Dragon speech recognition software. Has anybody used this and would it work with ARC? It does seem a little bit pricey to me, but it also seems to get good user reviews and comes complete with a headset.
@Toymaker uses DNS11 too, I don't know if he has it "plugged in" to ARC or not though.
I don't know if I edited soon enough but I linked to a post I wrote some time ago with some tips in it. Correcting words and training etc. All of that helps. Also, use the correct profile language - in our case English(GB). I know there was an issue with ARC a long time ago with any profile that wasn't English(US) but I think this has been fixed.
As for mics, I have a Turtle Beach BT headset which is made for the PS3, but with some messing around it works on the PC. It also pairs to 2 devices at once, so you could pair to your phone too, and if the phone kicks in it auto switches over, which was a cool feature I thought. It works pretty well, however there are better.
Tony (@Toymaker) has suggested some mics in the past but they were very expensive, I can't remember what they were now since when I saw the price tag (which to be fair is probably around the same amount I spent on testing out cheaper alternatives) I dismissed it. He will likely chime in if he sees this.
Basically, you get what you pay for. A cheap mic won't be as good as an expensive one. It's most likely going to be a case of checking reviews. Another place to check is the Windows SAPI focussed forums on other sites. Vox Commando's forums probably have some good advice too (it uses Windows SAPI - it's what I use for a lot of Jarvis' functions because it works better than ARC in different areas, i.e. it can use payload files and can look up from databases of media players such as XBMC).
With the xtag and DNS11 I get around 98% in dictation mode (saying anything) and almost 100% in fixed grammar mode.
I have today loaded the new DNS13 on my i5 tablet, which does not have my DNS voice profile on, so it's a clean install, and without any training I am dictating (saying anything) at almost 99% accuracy. That's pretty impressive!
As Rich said, you get what you pay for in speech recognition.
Unfortunately (as far as I know) you cannot use Dragon with EZ-Robot, but this would be a fabulous addition if DJ could do it. I know it's expensive, but the ability to say almost anything (without pre-loaded grammar files) opens all sorts of possibilities, especially with things like chatbots etc.
I know ARC and Windows have much better accuracy with a nice headset like Tony's pocket mic he linked to. However I passionately wanted to be able to walk into a room and just speak to my robot. I have scripted sentences I've written into the voice recognition control that trigger already recorded sound files, and commands that trigger other scripts and functions on my B9. I use the Blue Snowball mic with great success. I don't know if others have tried it, as each time I bring it up I get no feedback on it. I can set it up anywhere in the room where the robot is, plug it into the laptop through USB, and it seems to hear me wherever I am, and others can talk to him also. It has a unique three-pattern switch (cardioid, cardioid with -10dB pad and omni) for different listening patterns. Of course if there are a bunch of people talking and making noise I get some false triggers and accuracy drops off the cliff. I can always just place the speech recognition on pause if it gets irritating, or if I want to communicate in those conditions I can use Tony's pocket mic.
Blue Snowball Mic
*eyeroll*. That did make for some good reading. Very nice job with that tutorial mate. Very easy to understand and well laid out. Well done. I have not had a chance to have a look at Vox Commando's forum yet, but I will have a look a little later.
@ Toymaker. I do like the look of the xTag but it's a bit out of my price range for the moment. Like yourself and Rich said, you get what you pay for, and this does sound like a serious bit of kit. I agree that Dragon is a bit on the pricey side, but from the reviews I have read elsewhere it would be something I would seriously consider getting, IF that is, it would work with EZ Robot hard/software. @ DJ. If you're reading this buddy, is having Dragon connect to ARC on one of your possible "to do" lists? It sounds like it would be a popular, welcome and useful addition to the EZ platform.
@ Dave. I'm completely with you on your statement, "However I passionately wanted to be able to walk into a room and just speak to my robot." I would love to be able to do the same as that really would be a neat feature, especially in regards (like you mentioned) to having other people chat to K-9 without having to hold a mic. But for now I would just be happy enough for my speech recognition program to understand me (walk before I can run kind of thing). I like the look of the Blue Snowball mic you mentioned. It's a shame that nobody left you any feedback, but from what I have seen elsewhere on the interweb, it does look reasonably priced and seems to have good reviews from other users too.
Back in Dragon Dictate v8 or so, they used SAPI, so anyone could integrate, but in either 9 or 10 they went proprietary, reportedly because SAPI couldn't meet their needs and Microsoft was starting to compete with them in the enterprise voice recognition space, so there was some animosity.
I had a good look through your speech recognition tutorial and adjusted a few settings on my system and it has made a bit of a difference. I will definitely be investing in a new mic at some point, but as you say, it's all about the training so I'm doing as you suggested and doing a bit of training for an hour or 2 a few days a week. I just hope it makes a vast improvement to what it is now.
Thanks for your help, and to everyone else for your input as well.