Using Dragon Naturally Speaking With ARC

This has been brought up before in some threads, but has come up again recently. I promised that I would make a post about this so, here it goes...

There have been some posts where people have asked to be able to use Dragon Naturally Speaking (DNS) with ARC. To do this, there would have to be a separate application installed which would handle this unless there is a change to ARC itself to allow this to happen. To build an application outside of ARC to allow this would require some programming and a $2000.00 investment in the DNS developer software. To use this new tool, you would have to launch a separate application on your computer to handle speech recognition, which would pass the spoken text back to a variable in ARC. You would have to develop scripts in ARC to handle what to do with this spoken text. Another application would also need to be written that would allow the user to train DNS.

The idea of crowd funding has been brought up a few times in the past, but always in other threads that not everyone might have seen. I would like to suggest crowd funding of this idea again. It would be great if we could crowd fund the purchase of DNS Developer for EZ-Robot to add the ability to choose either DNS or Windows Speech Recognition engine directly within ARC. If that is not possible and there were enough people willing to crowd fund the purchase of DNS Developer, I would be willing to develop the 3rd party apps that could be used to handle what was described above.

This is to see how much interest there is in doing this. Please post if you would be interested in donating to this project, or in donating to see if we can get this feature added directly into ARC.

One important thing to remember here: this path would only enable DNS for the applications above. It would not enable it for any application outside of these. In other words, you wouldn't have all of DNS on your machine, only what is needed for use with ARC.


United Kingdom
#9  

I just want to add to what I wrote in post #2.

It is great that David C is willing to take this on, and I cannot think of a better person for the job. His willingness to do this, not just for his EZ-AI program, but for all of us as well, is just fantastic and I cannot thank you enough David. But whatever direction this takes, I would personally like to see some involvement from EZ-Robot with its implementation with ARC (it is their platform after all).

DJ and the guys are striving to be the best robotics software company in the world and are currently doing a fantastic job in doing just that, which is clear to see. But to keep being the king of the castle in this field requires that the very best hardware and very best software is used, and in this case, using DNS falls in this category. And as Dave S has already said, Speech Recognition really is quite an important part of ARC and robot control in general. So I would like to think that the EZ-Robot team would be willing to take up the challenge of implementing DNS into ARC using its members' (our) donations to make this a reality and to keep ARC and EZ-Robot the best in robotics software, or even participate in David C's own take on this project if that's the direction this will take.

Either way, I would still be happy to donate cash and be a tester for this project whoever takes it on. My current financial situation doesn't leave me a lot of disposable income at this time, but I would be willing to donate £70 GBP (about $100 USD) to this, bearing in mind I will still need to pay out about $120 to purchase the DNS software itself, and if I can afford to pay a little more at the time of funding I would be more than happy to do just that to help make this a reality.

In any case, I am really excited about this project as I am more than a little fed up with Windows speech rec, even when I use a quality mic and lots of training. This can only be a good thing. And a quick mention to show my personal thanks to Dave S for his willingness to take care of the fund raising account.:)

Steve.

#10  

There would be no need for you to buy dragon. I will explain in a few minutes. On cell phone right now...

#11  

The reason that the DNS SDK costs so much is that it allows the redistribution of the DNS components via the application that is developed. This prevents others from having to buy DNS to get these components. This is also why DNS would only work with the application that included these components (the third party app or ARC).

I would prefer that DJ took this on so that it would be in ARC, but if that isn't possible, the third party app is the other solution that would allow this. It would be less tightly integrated with ARC but would use the hooks that are available in ARC to communicate directly with ARC via variables, with port 6666 as the default. This would be configurable just like EZ-AI is configurable. The settings would live in the application.config file, which would prevent the need for a database engine to be running.
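To give a rough idea of what "pass the spoken text to a variable over port 6666" could look like, here is a minimal Python sketch. Everything specific in it is an assumption for illustration only: the variable name `$SpokenText` is made up, the helper names `build_command` and `send_to_arc` are hypothetical, and it assumes ARC's script server accepts one line of EZ-Script per line of text on that port. The real third party app would most likely be a .NET application using the DNS SDK, so treat this only as a picture of the hand-off, not the actual implementation.

```python
import socket

ARC_HOST = "127.0.0.1"          # assumption: ARC runs on the same PC
ARC_PORT = 6666                 # default port mentioned in this post
VARIABLE_NAME = "$SpokenText"   # hypothetical ARC variable name


def build_command(text, variable=VARIABLE_NAME):
    """Build one script line that assigns the recognized text to a variable."""
    # Replace double quotes and line breaks so the text stays inside
    # a single quoted string on a single line.
    safe = text.replace('"', "'").replace("\r", " ").replace("\n", " ")
    return f'{variable} = "{safe}"\r\n'.encode("utf-8")


def send_to_arc(text, host=ARC_HOST, port=ARC_PORT, timeout=2.0):
    """Open a TCP connection and send the assignment line (assumed protocol)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_command(text))


if __name__ == "__main__":
    # In the real app this string would come from the speech engine.
    print(build_command("robot move forward"))
```

A script inside ARC would then read the variable and decide what to do with the phrase. Keeping the host, port and variable name in one place mirrors the idea of putting them in the config file rather than in a database.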

Also, Richard R, that was a good question. I would want to know what I would stand to gain from this as well and it hadn't really been identified yet.

PRO
USA
#12  

Count me in for a donation. I would really look forward to this advancement/addition.

United Kingdom
#13  

@David.

Quote:

There would be no need for you to buy dragon.

That's good to know, and kinda makes sense to me now. Thanks for confirming. Consider my funding to go up to $200 then as this is the case;). Yeah I sure hope that DJ will jump in on this thread soon and share his thoughts on this.

A couple of thoughts on the personal benefits I see in using Nuance DNS...

  1. Using ARC's speech rec works okay for me, but as my S/R command list grows ever larger, even with regular and consistent Windows V/R training, false positives and low confidence levels start to become more apparent. Also with use of the Pandorabot control using S/R and the full language library, accuracy levels are about 50% at best with a lot of my speech not being recognised at all, and that is with a good quality headset, lots of training, and mic levels set to a good level. Using my iPhone's S/R (also created by Nuance) via a VNC remote app to do this is simply a joy to use with accuracy of about 98%. I'm hoping using DNS will have similar accuracy levels for using the Pandorabot control.

  2. I don't know if the same thing holds for the DNS computer software, but Nuance's iPhone recognition doesn't need user profiles to be set up, meaning that any person can pick up and use my iPhone's S/R and get great accuracy results immediately. I'm hoping DNS will be something similar, which would mean anyone can talk to your robot without going through extensive Windows voice recognition training.

And the bottom line is that Nuance's Dragon Naturally Speaking is highly rated and reviewed, and is widely used over the Windows S/R offering. To use this with ARC would be a great addition.

#14  

Like Richard R, I am undecided about this. And, like him, I have had no problem with the Windows speech engine. It's worked great right off the bat with zero training. While all the bells and whistles that DNS would bring on board are nice, the main point here is quality voice recognition. In that regard, while DNS will do a better job with the same input, will it really be enough to make all that much difference? Does anyone really know? In my view, the core issue is not so much the recognition engine as it is the quality of the audio delivered. I put my money into ensuring the best quality of audio goes in first. Quality microphones and good microphone placement. Low noise amplifiers. Fast processing of the audio signal for best quality output. That sort of thing. Once those methods have been exhausted, only then would I turn to better quality of recognition.

As an aside, to me, ultimately the goal is to place the audio pickup device in the robot itself. Here I would not rely on a combination speaker/microphone setup, but one, or more, small but high quality microphones with an on-board amp. At this point (for ARC anyway) the audio would be fed to ARC from the robot. This could eventually lead to an on-board processing unit that could be placed in the robot to handle the speech function. Voice processing is such an integral part of the robots we want to build after all. Nearly as integral as the ezb is for servo control. I realize placing a microphone in the robot seems like a rather roundabout way of doing things right now since it is the PC that actually does the recognition. But then, that's what we do with the camera. I also realize we have to do that with the camera in order to get the robot's perspective of what it sees, so why not also get the robot's audio perception of its surroundings as well? Plugging the microphone into the computer is the direct route, of course, but the experience gained in talking directly to the robot will be invaluable later in the development process. It will seem more natural to people as well, while allowing for experimenting with the direction of the incoming sound. For instance, the robot could turn to look at the person talking or at other noises in the area. Follow a noise to find its source. Even identify the noise by its pattern, as we do now for the video.

#15  

The difference in ability between DNS and MS is pretty staggering. This is why DNS is used in the medical field for transcription and dictation. The differences have also been tested by multiple users of this community. I personally have seen the differences, as my father-in-law is a doctor and I have seen DNS and how well it works. It is at least 2x more accurate than the Windows SR engine. If you do some research into what all has happened with MS speech recognition, it is easy to see why they have had the issues that they have had. This topic is really about whether you would be willing to help fund the ability to choose which engine you want to use.

#16  

Fine, I get it. Too bad there is not a way to delete one's posts. All I can do is stop posting my opinions in the future.