
Hi everyone.
I am looking for some much-needed help and advice with regard to speech recognition, and I'm hoping you can point me in the right direction.
I'm currently waiting for my EZ robot kit to arrive, so in the meantime I have been familiarising myself with the EZ builder software and have been playing around with the speech recognition and Pandorabots options. EZ builder is currently installed on a Windows 7 64 bit laptop and I'm currently using a headset microphone, not a great quality one I admit, but it does work.
So after training the Windows speech recognition program a few times, and adjusting the microphone settings, I have been trying to use it with MS Word, Notepad, EZ builder with my pandorabot and the pandorabots website on its own, and I have come to the conclusion that Windows speech recognition is, well, "pants" to put it politely (unless I'm missing something). In a quiet room with no background noise, and speaking with a clear English accent, it only seems to get 40 to 50% of what I'm saying right, compared to the 95% I'm getting on my iPhone, which is getting really frustrating now.
So I would like to ask...
1.) Is there a better way of training Windows speech recognition?
2.) Can anyone suggest a decent well priced microphone to use?
3.) Is there any better speech recognition software I could use which will work with EZ builder?
4.) And finally, what sort of set up do you guys use that works well with your projects that use speech rec?
I really need to nail this, as speech recognition will play a big part of interacting with my robot when he is finished. So any help, thoughts, or suggestions you guys can offer really would be appreciated, and I thank you in advance.
Cheers.
Steve.
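For anyone who wants to put a number on figures like the 40 to 50% above, here is a minimal sketch of one way to score a recognition test. It assumes you read out a known sentence and then paste in what the recogniser typed; the function name `word_accuracy` is made up for this example and is not part of EZ-Builder or Windows speech recognition.

```python
def word_accuracy(reference: str, recognised: str) -> float:
    """Return 1 - (word errors / reference words), floored at 0.

    Word errors are counted as the word-level edit distance
    (substitutions + insertions + deletions) between the sentence
    you actually said and the text the recogniser produced.
    """
    ref = reference.lower().split()
    hyp = recognised.lower().split()
    if not ref:
        # Nothing was said: perfect only if nothing was recognised.
        return 1.0 if not hyp else 0.0

    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missed
                            curr[j - 1] + 1,      # extra word heard
                            prev[j - 1] + cost))  # match or wrong word
        prev = curr

    errors = prev[-1]
    return max(0.0, 1.0 - errors / len(ref))


if __name__ == "__main__":
    said = "robot move forward now"
    heard = "robot moo forward"
    print(f"{word_accuracy(said, heard):.0%}")
```

Running the same few test sentences before and after a training session, or with two different mics, gives a like-for-like comparison rather than a gut feeling.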
Yeah, I'm with you about a headset mic replacement, Rich. Maybe that is the issue I'm having. Any suggestions?
When you say about training, do you mean going through the "Windows voice training" repeatedly over time, or actually correcting the words using the "Correct (word that is wrong)" command until it gets it?
I have looked through the Nuance website reading about the Dragon speech recognition software. Has anybody used this and would it work with ARC? It does seem a little bit pricey to me, but it also seems to get good user reviews and comes complete with a headset.
Steve.
I've used DNS11 but not with ARC. I believe you could link the two somehow though (although I haven't had a chance to look into that). DNS11 was awesome, 13 is now out (or soon to be out).
@Toymaker uses DNS11 too, I don't know if he has it "plugged in" to ARC or not though.
I don't know if I edited soon enough but I linked to a post I wrote some time ago with some tips in it. Correcting words and training etc. All of that helps. Also, use the correct profile language - in our case English(GB). I know there was an issue with ARC a long time ago with any profile that wasn't English(US) but I think this has been fixed.
As for mics, I have a Turtle Beach BT headset which is made for the PS3, but with some messing around it works on the PC. It also pairs to 2 devices at once, so you could pair it to your phone too, and if the phone kicks in it auto switches over, which was a cool feature I thought. It works pretty well, however there are better.
Tony (@Toymaker) has suggested some mics in the past but they were very expensive, I can't remember what they were now since when I saw the price tag (which to be fair is probably around the same amount I spent on testing out cheaper alternatives) I dismissed it. He will likely chime in if he sees this.
Basically, you get what you pay for. A cheap mic won't be as good as an expensive one. It's most likely going to be a case of checking reviews. Another place to check is the Windows SAPI focussed forums on other sites. Vox Commando's forums probably have some good advice too (it uses Windows SAPI - it's what I use for a lot of Jarvis' functions because it works better than ARC in different areas, i.e. it can use payload files and can look up from databases from media players such as XBMC).
I get outstanding performance and speech recognition accuracy with the Revolabs xTag wireless microphone www.revolabs.com/products/product-line/xtag-usb It's a professional mic and expensive. I wear it inside my shirt pocket and no one notices that I am even wearing a wireless mic.
With the xtag and DNS11 I get around 98% in dictation mode (saying anything) and almost 100% in fixed grammar mode.
Today I loaded the new DNS13 on my i5 tablet, which does not have my DNS voice profile on it, so it's a clean install, and without any training I am dictating (saying anything) at almost 99% accuracy. That's pretty impressive!
As Rich said, you get what you pay for in speech recognition.
Unfortunately (as far as I know) you cannot use Dragon with EZ-Robot, but this would be a fabulous addition if DJ could do it. I know it's expensive, but the ability to say almost anything (without pre-loaded grammar files) opens all sorts of possibilities, especially with things like chatbots etc.
Tony
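A rough illustration of why fixed grammar mode scores higher than free dictation: when the recogniser only has to choose between a handful of known phrases, a slightly mangled result can still be snapped to the nearest one. The sketch below imitates that idea on plain text using Python's standard `difflib`. It is not how DNS or Windows SAPI work internally, and the phrase list and the `match_command` helper are made up for this example.

```python
import difflib
from typing import Optional, Sequence

# Hypothetical command list, standing in for a pre-loaded grammar file.
COMMANDS = [
    "robot move forward",
    "robot turn left",
    "robot turn right",
    "robot stop",
]


def match_command(heard: str,
                  commands: Sequence[str] = COMMANDS,
                  cutoff: float = 0.6) -> Optional[str]:
    """Snap a noisy transcript to the closest known phrase.

    Returns None when nothing is similar enough, which is the
    fixed-grammar equivalent of ignoring unrelated chatter.
    """
    cleaned = heard.lower().strip()
    matches = difflib.get_close_matches(cleaned, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    print(match_command("robot turn lift"))
    print(match_command("what is the weather like today"))
```

Raising `cutoff` cuts down on false triggers at the cost of rejecting more genuine commands, so it is worth tuning against your own voice and mic.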
Too bad someone can't make a plug-in for DNS like Justin did with Face Recognition. That would be an unbelievable addition to ARC.
I know ARC and Windows have much better accuracy with a nice headset like Tony's pocket mic he linked to. However, I passionately wanted to be able to walk into a room and just speak to my robot. I have scripted sentences I've written into the voice recognition control that trigger already recorded sound files, and commands that trigger other scripts and functions on my B9. I use the Blue Snowball mic with great success. I don't know if others have tried it, as each time I bring it up I get no feedback on it. I can set it up anywhere in the room where the robot is, plug it into the laptop through USB, and it seems to hear me wherever I am, and others can talk to him also. It has a unique three-pattern switch (cardioid, cardioid with -10dB pad and omni) for different listening patterns. Of course, if there are a bunch of people talking and making noise I get some false triggers and accuracy drops off the cliff. I can always just place the speech recognition on pause if it gets irritating, or if I want to communicate in those conditions I can use Tony's pocket mic.
Blue Snowball Mic
@ Rich. Thanks for your link. To be fair I think I missed it first time round reading your post (eye roll). That did make for some good reading. Very nice job with that tutorial mate. Very easy to understand and well laid out. Well done. I have not had a chance to have a look at Vox Commando's forum yet, but I will have a look a little later.
@ Toymaker. I do like the look of the xTag but it's a bit out of my price range for the moment, but like yourself and Rich said, you get what you pay for, and this does sound like a serious bit of kit. I agree that Dragon is a bit on the pricey side, but from the reviews I have read elsewhere it would be something I would seriously consider getting, IF that is, it would work with EZ Robot hard/software. @ DJ. If you're reading this buddy, is having Dragon connect to ARC on one of your possible "to do" lists? It sounds like it would be a popular, welcome and useful addition to the EZ platform.
@ Dave. I'm completely with you with your statement, "However I passionately wanted to be able to walk into a room and just speak to my robot." I would love to be able to do the same as that really would be a neat feature, especially with regard (like you mentioned) to having other people chat to K-9 without having to hold a mic. But for now I would just be happy enough for my speech recognition program to understand me (walk before I can run kind of thing). I like the look of the Blue Snowball mic you mentioned. It's a shame that nobody left you any feedback, but from what I have seen elsewhere on the interweb, it does look reasonably priced and seems to have good reviews from other users too.
Steve.
The problem with DNS integration is that their SDK costs a small fortune. They build in support for common Windows apps and the operating system, and I think anywhere you can type, it can fill in (so some integration may be doable, we just need to think it through a bit), but if you want to embed the functionality into your own app, you need their SDK which is very pricey.
Back in Dragon Dictate v8 or so, they used SAPI, so anyone could integrate, but in either 9 or 10 they went proprietary, reportedly because SAPI couldn't meet their needs and Microsoft was starting to compete with them in the enterprise voice recognition space, so there was some animosity.
Alan
@ Rich
I had a good look through your speech recognition tutorial and adjusted a few settings on my system and it has made a bit of a difference. I will definitely be investing in a new mic at some point, but as you say, it's all about the training, so I'm doing as you suggested and doing a bit of training for an hour or 2 a few days a week. I just hope it makes a vast improvement on what it is now.
Thanks for your help, and to everyone else for your input as well.
Steve.
cool mic or sure old mic