United Kingdom

Cortana Speech Recognition Integration With ARC

We have managed to integrate Cortana unlimited (same as dictation mode) speech recognition with ARC, which means we can now input any speech into ARC (at no cost, unlike Dragon). We are using the HTTP custom server, as can be seen in the screenshot below.

User-inserted image

We added a Pandorabot to ARC (to use it with unlimited speech), but it always seems to use its own (default MS) speech recognition, and we can't see a way to send our speech string into the Pandorabot.

Can DJ or anyone advise how we may do this? Running Pandorabot with reliable, accurate unlimited speech recognition would be very neat!

Thanks in advance for any help.

Tony


#1  

Exciting! I hope someone can come up with something!

United Kingdom
#3  

DJ, at the moment we are just trying to see if we can use Cortana as a speech recognition input for ARC, giving us a dictation-type mode so that we are not stuck with grammar (limited vocabulary) mode. My plan is to integrate it with Pandorabot, where we can ask any question rather than only the pre-defined ones that grammar mode forces. I have been playing with Cortana speech recognition for a few days now and get around 99% accuracy; it is surprisingly good!

Can you give me a bit more info (code snippet etc.) on the send text control command required? As always, thanks for your help here.

Tony

PRO
Synthiam
#4  

Microsoft's documentation says Cortana uses the same speech engine as Windows speech recognition. Strange that you would get different results.

The ControlCommand() syntax for any control can be viewed in the Cheat Sheet tab when editing a script. Add the Pandorabot control and, when editing a script, check the Cheat Sheet tab for examples.

Here is a link that explains the Cheat Sheet and the EZ-Script editor: https://synthiam.com/Tutorials/Lesson/23?courseId=6

United Kingdom
#5  

DJ, I think you are possibly mistaken. Cortana speech recognition is cloud based, not PC based, and I believe it is a much better SR engine. My reasoning is detailed below.

"Cortana’s speech recognition is actually a cloud-based system, where blocks of speech are submitted to the cloud for translation"

The above is referenced here

http://www.develop-online.net/tools-and-tech/how-windows-10-and-cortana-are-bringing-speech-recognition-to-games/0215391

Cortana also passes speech through an NLP (natural language processing) filter, which would obviously improve the SR engine output.

"The natural language processing capabilities of Cortana are derived from Tellme Networks (bought by Microsoft in 2007)" (from Wikipedia)

Cortana has to have a better SR engine, for these reasons:

Talking to the Pandorabot via ARC (PC-based SR), I get about 80% accuracy, and it also hears itself, which causes false recognitions. If I disconnect from the net, SR continues to work, proving that it is the internal SR (in my opinion not very good at dictation), based on a derivative of the Microsoft SR engine 6.1.

Talking to Cortana through my app yields 99% accuracy (similar to the accuracy of Dragon). If I disconnect from the net, SR stops working, proving that it's cloud based.

My Cortana based SR also waits for its name which is important to stop false recognitions when interacting with the Pandorabot.

I may be wrong here, but then I cannot explain the huge difference in performance that I am seeing.

Tony

#6  

Yes, Cortana is 100% cloud based and runs through a service on the computer that handles this. The service is quite a pain to turn off if you don't want it running and consuming resources on the computer. The service is quite bloated, but Cortana does work fairly well without training. It is also free to use if you upgraded to Windows 10. If you didn't, the upgrade to Windows 10 is no longer available for free. Also, if you have an older Mac running Boot Camp, Windows 10 isn't an option.

PRO
Synthiam
#7  

That's what I originally thought, but some Microsoft documentation led me astray some time ago. This page: https://msdn.microsoft.com/cortana/getstarted

Says this...

Quote:

Windows speech

Windows speech is a set of UWP APIs that enable both speech recognition and speech synthesis across multiple languages on all Windows-10 based devices, including IoT hardware, phones, tablets, and PCs.

Cortana on Windows uses these speech APIs.

Perhaps what they are failing to say correctly is Cortana uses the speech API for synthesis, not recognition.

Lastly, if you want pandora bot to be disabled from listening to voice commands, simply pause the control with the checkbox. View available control commands using the Cheat Sheet as previously stated.

#8  

Interesting that Cortana uses TellMe. I thought Microsoft had sold them. Maybe they just sold the commercial IVR business and kept the technology. I'll need to do some research (in my last job I did some work with M$/TellMe on a partnership that fell apart shortly after I was laid off).

Alan

United Kingdom
#9  

David, yes when running my Cortana app the resources are up a bit.

With my Cortana app (and ARC) running I get 3 to 8% CPU

With it not running I get 1 to 4% CPU

This is on an i3 Windows 10 Acer micro desktop PC

The main thing is that I seem to be getting SR (dictation) accuracy that is close to Dragon, and it's free. The ability to say anything into ARC has some good applications and gets it away from having to use pre-defined grammars.

Alan, the TellMe info came from this Wikipedia page:

https://en.wikipedia.org/wiki/Cortana_(software)

Tony

#10  

Tony, right. If you had an Atom you would see higher load. It would be up around 10% without running anything, just because the service is running on the device. Once the exe is yanked out from under the OS you get down to about 2%. This is without running anything at all through the SR engine in both instances.

As you say, it is good for free, if you upgraded to Windows 10. The issue with adding it to ARC as a core component is that the software would then require Windows 10, which not everyone was keen on. This is one of the reasons that we didn't use it in our project. We wanted to run on the same platforms that ARC supported. Unfortunately, getting from pre-Windows 10 to Windows 10 is no longer free, so using Cortana is no longer free for people who didn't upgrade. I just wanted to be sure this was out there somewhere so that people would know. I also work with others who have Macs and run Boot Camp, which doesn't always allow you to upgrade to the latest version of Windows. I know that 2010 model MacBooks won't allow you to go above Windows 7. If this was placed in as a core piece of ARC, you would then have to make known to these users that their hardware can no longer support all of the features of EZ-Robot, or write a component that would simulate the purpose of the service and be compatible with the older OSes. That could be done and might be done by someone at some point, eliminating the concern.

I am fine with it either way. I just wanted to make it known to people before a huge push happens that requires or recommends that ARC run on Windows 10. I run all different types of OSes and my Windows computers are Windows 10, except for the computers going into my school. It is easy enough to find purposes for these computers aside from ARC, though; I am just trying to make sure all of the cost information is available...

United Kingdom
#11  

David, some good points there and you are right it will only work on Windows 10 upwards when it comes to PC, tablet etc.

But I am really surprised how well it works. I thought it would work, but I did not expect this accuracy without any prior training.

Tony

#12  

Sounds good to me... I am operating on an upgraded Windows. Anyway, it would not be a core feature; it would rather be a plugin, right? ARC does not run on Vista btw... ;)

#13  

It depends on what DJ decides to do. It could be a plugin, or made to be a core component. DJ builds the core components and anyone can build a plugin.

United Kingdom
#14  

DJ, it may be my age, or maybe as I get older I am getting a bit thick! I have spent a lot of time trying, but I just cannot see any way to add an external SR input to pandorabot via ARC. There is a script for the output side but this is no use here.

Are you sure there is a way? Maybe you misunderstood what I am trying to do, it may be not possible at present.

The recognised speech string arrives in the HTTP custom server as seen below. This string then needs to somehow get into Pandorabot.

User-inserted image

Obviously in the string the %20 is the space (blank) character.
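
For reference, turning such a percent-encoded string back into plain text is a standard operation in most languages. A minimal Python sketch, assuming the phrase arrives as a plain percent-encoded string; the sample phrase is illustrative and not taken from the screenshot, and `decode_phrase` is a hypothetical helper name:

```python
from urllib.parse import unquote


def decode_phrase(encoded: str) -> str:
    """Turn a percent-encoded phrase (%20 for space, etc.) back into plain text."""
    return unquote(encoded)


# Illustrative input only; the real string is whatever the recogniser sends.
print(decode_phrase("what%20is%20the%20weather%20like%20today"))
```

This only covers the decoding step; getting the decoded text into Pandorabot is a separate question.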

I would greatly appreciate any advice/example that you can give.

Tony

United Kingdom
#15  

OK, having received no replies, I must be correct that there is no way to convert the Cortana SR phrase data received by the HTTP custom server into a text string that can then be used in Pandorabot.

I must say that I am surprised that there seems to be no interest in linking Cortana (unlimited SR) into ARC. I guess I will now need to do this outside ARC (in my own custom software) if the link with Pandorabot is not possible.

Tony

#16  

This article from Computerworld tends to indicate Cortana is spyware. Not to mention hard to rid yourself of. Perhaps people are wary of using it.

#17  

@ToyMaker, I'm very interested in this. I have not had time to play around with it, but you seem to be getting results, even if they are not working correctly for you.

One thing I do struggle with a bit is if I do use free speech recognition outside of the defined speech recognition control in ARC, what would our robots do with that? We would still need to hard code recognized words or phrases and responses for those, correct?

#18  

What's really needed, as far as the Pandorabot control is concerned, is an input variable. DJ added an output variable to the control so that its output could be displayed or otherwise used as desired. If he would now add an input variable as well, it would allow you to put the response from Cortana into Pandorabot. After appropriate massaging to extract only the words themselves, of course.

United Kingdom
#19  

@WBS00001, you are right, we need an input variable, but unfortunately that's probably not going to happen.

Tony

PRO
Synthiam
#20  

Please use the Cheat Sheet tab when editing scripts to see all available ControlCommands for any controls in your project. As mentioned, the Cheat Sheet tab should always be referenced to identify the control commands for each control in your project.

User-inserted image

If the PandoraBot control is the subject of the question, view the Cheat Sheet when the PandoraBot control is added to your project. You can find out more about the Cheat Sheet and EZ-Script in the activities of the Learn section of this website. Here is a direct link: https://synthiam.com/Tutorials/Lesson/23?courseId=6

The Cheat Sheet command you are looking for is


ControlCommand("PandoraBot", SetPhrase, [phrase])

In the future, please use the Cheat Sheet tab, as I have replied numerous times on this topic.

#21  

@DJSures So there IS a way. My sincere apologies for not looking before I leapt. In penance, I shall have my JD, Skippy, repeat the phrase "Check the Cheat Sheet, stupid." and look at me chidingly, while stamping his foot every hour for the next two days. Or until I knock him off my desk with a baseball bat, whichever comes first.

Still, ya know ... you could have just pointed that out in the first place and saved all this fuss.

PRO
Synthiam
#22  

I did numerous times in this thread. I always direct any questions to the answer. Please visit responses 3, 5 and 8, in all of which I repeated to check the Cheat Sheet, which is why I mentioned it a fourth time in the most recent response.

4th time is a charm, I guess :)

United Kingdom
#23  

OK, I have spent all morning and still cannot see how ControlCommand("PandoraBot", SetPhrase, [phrase]) works, or how it can get the Cortana SR into Pandorabot.

I get the Cheat Sheet thing, but to me it is not a lot of good without examples of how to use some of the more obscure EZ functions like Pandorabot.

I can find no examples anywhere of how to get a string input from the HTTP Custom Server into Pandorabot (or anything similar), so I tried everything that I can think of with no luck.

I know when I am defeated.

Please do not worry about replying. I have spent days on this with no results, so I am now going to give up on the whole thing. It may be possible to do, and it may just be me being a bit thick, but I do not have the time to keep messing around with it and getting nowhere.

Anyway DJ thanks for attempting to help me, I really appreciate it, but I think I now need to close this thread and do what I want another way.

Tony

PRO
USA
#24  

@Tony,

Can you show where you set up the custom HTTP server URL (Cortana side)?

PRO
Synthiam
#25  

Of course it is possible, which is why I repeat the same response. The script that you send to the HTTP server or telnet server is the control command.

For example, imagine that you wanted to send the phrase "hello world" to the PandoraBot control. After reading the EZ-Script manual link that I provided earlier, you will see that ControlCommand() does that. ControlCommand() sends a command to a control, hence the ControlCommand() naming convention.

Every control, as mentioned in the manual, accepts different commands. Commands are different because the camera control and the pandora bot control have different features. Sending a motion detection command to the pandora bot control would not make sense.

So to view each control command, use the cheat sheet.

Here is how you would send "hello world" to the pandora bot control...


ControlCommand("PandoraBot", SetPhrase, "hello world")

Here is how you would send "I am a banana" to the pandora bot control...


ControlCommand("PandoraBot", SetPhrase, "I am a banana")

Here is how you would send "the sky is blue" to the pandora bot control...


ControlCommand("PandoraBot", SetPhrase, "the sky is blue")

PRO
USA
#26  

@DJ,

I presume a Cortana app is calling the ARC Custom HTTP Server URL with the recognised text.

Tony's issue is how to integrate the ControlCommand with the Custom Http Server.

PRO
Synthiam
#27  

That answer is even easier by using the HTTP server and viewing the script manager page. There's even the Cheat Sheet displayed. Here is a screenshot...

User-inserted image

And as you can see by reading the included instructions when using the HTTP Server, you can execute scripts via the http server. For example...


http://192.168.0.169:88/Exec?password=admin&script=ControlCommand(%22PandoraBot%22,%20SetPhrase,%20%22what%20is%20the%20weather%20like%20today%22)
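
On the sending side, a URL of this shape can be built from any recognised phrase. The following is a minimal Python sketch, not part of ARC: `build_exec_url` is a hypothetical helper, the host, port and password are simply the values from the example above, and dropping embedded double quotes is an assumption, since this thread does not cover how EZ-Script handles quotes inside a string.

```python
from urllib.parse import quote


def build_exec_url(host: str, port: int, password: str, phrase: str,
                   control: str = "PandoraBot") -> str:
    """Build an HTTP Server Exec URL that sends `phrase` to a control via SetPhrase."""
    # Assumption: embedded double quotes are dropped so the script string stays well-formed.
    clean = phrase.replace('"', "").strip()
    script = f'ControlCommand("{control}", SetPhrase, "{clean}")'
    # Leave parentheses and commas readable, as in the example URL; encode the rest.
    encoded_script = quote(script, safe="(),")
    return (f"http://{host}:{port}/Exec"
            f"?password={quote(password, safe='')}&script={encoded_script}")


print(build_exec_url("192.168.0.169", 88, "admin", "what is the weather like today"))
```

With those inputs the helper reproduces the example URL; a sending app would then issue an ordinary HTTP GET to it.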

And my answer to the question about PandoraBot hearing itself had also been ignored, so I'll expand on that as well. You can PAUSE the control either by using your hand on the mouse and selecting the checkbox on the PAUSE option, or by using ControlCommand():


ControlCommand("PandoraBot", PauseOn)

PRO
USA
#28  

I just tested:

  1. added HTTP server control, port 8888
  2. Pandorabot control (default EZ bot ID)
  3. Pandorabot control pause on

Chrome URL (without the correct encoding chars):


http://192.168.18.65:8888/exec?password=admin&script=ControlCommand("PandoraBot", SetPhrase, "who are you") 

It works!

I missed some steps:

  1. open the Http Server url
  2. press Script Console link
  3. read the paragraph:

Quote:

Alternatively, if you wish to send EZ-Script commands directly through a web or HTTP interface, the url can be formatted as:

#29  

I looked at the Cheat Sheet for the HTTPServer controls (both of them) first this time ;) But there are only a couple of commands available. I'm aware of the method of executing commands via a properly formatted string sent to the HTTPServer control and have used it, but looking at the interface from Cortana, as Tony posted earlier, there doesn't seem to be a way to format what it sends to the HTTPServer control. All he can do is send whatever Cortana hears. That would be the problem now as I see it. He would need a way to get at what is displayed on the "Log" screen in the Custom HTTPServer control. Basically, get it into a variable so it can be filtered to get just the text desired, then pass it to the Pandora input via the SetPhrase instruction.
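
The filtering step described here could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the exact text the Cortana app sends is not shown in this thread, so the request-line format (`GET /phrase HTTP/1.1`) is assumed, and `extract_phrase` is a hypothetical helper, not an ARC function.

```python
import re
from urllib.parse import unquote


def extract_phrase(request_line: str) -> str:
    """Pull the spoken words out of a logged request line (format is an assumption)."""
    parts = request_line.split()
    # If the line looks like "GET /phrase HTTP/1.1", keep only the request target.
    if len(parts) == 3 and parts[2].startswith("HTTP/"):
        target = parts[1]
    else:
        target = request_line
    text = unquote(target).lstrip("/")
    # Keep letters, digits, apostrophes and spaces; anything else becomes a space.
    text = re.sub(r"[^A-Za-z0-9' ]+", " ", text)
    # Collapse runs of whitespace into single spaces.
    return " ".join(text.split())


print(extract_phrase("GET /what%20is%20the%20weather%20like%3F HTTP/1.1"))
```

The cleaned text is what would then be placed inside the SetPhrase command.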

PRO
USA
#30  

I asked for more details in post #25.

I believe the window "cortana test" in post #1 is a custom-developed app.

If that is the case, the code can be modified to send the correct HTTP script and query string parameters.

If it is not a custom-developed app, maybe there is a config file to customize the URL.

Only Tony knows.

United Kingdom
#31  

Well, after a huge amount of effort we finally got something working with the Cortana unlimited SR and the EZ-Robot Pandorabot. A big thanks to my genius friend Mike Hodgson, whose input and help did so much to get this all working!

With Cortana SR input I can say anything I want to the chatbot, as this video shows. It's all a bit rough though, as it is one of my first (vocal) chats with DJ's Pandorabot and I could not think of what to say!

Tony

PRO
USA
#32  

Brilliant Tony! It's great to see it all working together. Amazing accuracy!

PRO
Synthiam
#33  

That recognition is very impressive! Are you releasing the module as a plugin or a standalone app? Really glad to see you got it working!

#34  

Wow, that really is fast at answering your questions, nice work!

Are you planning on sharing this information or is it going to stay private?

With EZ AI down or not released this would really be good.

Thanks for sharing your video.

Cheers

United Kingdom
#35  

DJ, glad you like it. It's been fun chatting with your chatbot!

@merne - Windows 10 has made us jump through huge hoops to even get this far. It is such early days, and we need to do a lot more work before we can release anything.

I must say talking directly to a chatbot is very cool and, to my way of thinking, is the way to go for our robots.

I have become interested in Pandorabots because of Steve Worswick, who has now won the Loebner Prize twice, in 2013 and 2016, with his chatbots. His amazing award-winning chatbot Mitsuku is all AIML and shows just how far this type of chatbot can go.

Tony

PRO
USA
#36  

I think you are right, Tony. With regard to a companion, it's nice to have that kind of interaction vs. just having canned questions and answers. There is something natural to the flow. I downloaded the Cortana app for the iPhone; it's very fast and accurate.

PRO
USA
#38  

Amazing article. I'm keen on trying out his chatbot now. I read about the Loebner Prize when doing research on Alan Turing, after whom Alan (robot) is named. I have a feeling that with the explosion of AI, a winner of the contest will arise soon :)

#39  

@Toymaker Yeah, I installed Windows 10 with their free upgrade; it is a learning curve for sure.

I used Pandorabot in 2014 but it kept locking up my PC. I like how fast the response time is when you're speaking to your robot. Good luck on your success.

Cheers

#40  

Hey, this is really really cool! It's still kind of tiring to have a conversation with something that does not even know what a conversation is... but the setup runs super smooth, so let's wait and see what's maybe already waiting around the next corner with the new generations of chatbots to come! ;)

#41  

So, I'm wondering, and it may be impossible, but... I've not tried Chatbots, but the VR accuracy seems really high, with a quick interface. Is it possible to have it trigger scripts within ARC like we can now do with the VR built into Windows? I'd like to use Chatbots VR not as a conversation tool but as a trigger for EZ scripts and sound files. My robot being a reproduction of an original TV series robot, he's dependent on sound files of the voice actor who did his voice in the series (who is now deceased) to give him authenticity. I also use the VR in ARC to script motor movement and lights. I don't like the accuracy of the Windows VR, and this Chatbots interface looks like a great replacement.

EDIT: After thinking about it, I think I'm a little off here. It seems like I need to use only the Cortana Speech Recognition to trigger the scripts and voice files, and cut out the Chatbots?
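
One way to picture this idea is a lookup table from recognised phrases to script names. The Python sketch below is only an illustration under stated assumptions: the phrases and script names are hypothetical, matching is exact after normalising case and punctuation, and how the chosen script is actually started inside ARC is not covered here.

```python
import re
from typing import Optional

# Hypothetical phrases and script names, for illustration only.
SCRIPT_FOR_PHRASE = {
    "lights on": "LightsOn",
    "wave your arms": "WaveArms",
    "play greeting": "PlayGreetingSound",
}


def normalize(phrase: str) -> str:
    """Lower-case the phrase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", phrase.lower())
    return " ".join(text.split())


def match_script(phrase: str) -> Optional[str]:
    """Return the script name mapped to the phrase, or None if nothing matches."""
    return SCRIPT_FOR_PHRASE.get(normalize(phrase))


print(match_script("Lights ON!"))
print(match_script("tell me a story"))
```

Because dictation returns free text, unmatched phrases are expected; returning `None` lets the caller ignore them or pass them on to a chatbot instead.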

United Kingdom
#42  

Dave, I am looking into this and will let you know as things develop.

Here is a second quick (vocal) chat with DJ's Pandorabot; it is quite a lot of fun talking to a chatbot. The Cortana SR makes one mistake in this session (it seems to rarely make mistakes), which is the word "good". Strange, as it is such a simple word.

Next job is to start building my own Pandorabot that's geared towards the ALTAIR robots.

Tony

PRO
Synthiam
#43  

You can encode EZ-Script in the PandoraBot responses. The PandoraBot manual page explains more: https://synthiam.com/Tutorials/Help.aspx?id=189

PRO
USA
#44  

The speed and accuracy are what made me give up on it. But this changes things completely. I know I'm repeating myself. I'm just floored by the whole setup. Anyone here want to try and take a crack at this as a plugin or ?!

PRO
Synthiam
#46  

Because it's UWP, it's not a compatible plugin. Microsoft is removing C# functionality and removing desktop widget functionality with UWP. The Cortana library only works with UWP for some reason, which is an unusual direction to take, considering UWP is more mobile friendly but Microsoft doesn't actually have mobile devices anymore (other than the odd automotive GPS GUI). Windows has been known for desktop usage and desktop-style apps etc... they can't imagine SolidWorks, for example, being developed with UWP...

So it's a bit of a confusing direction they are taking. Cortana is UWP and doesn't have desktop-compatible libraries.

Guess in time we will see what's up Microsoft's sleeve :) Sure they have something planned that makes more sense than it does now. It's too early to tell... and we're all eager! :D

PRO
USA
#47  

Ah OK, now I understand. We will just have to sit and wait... see what they come up with. Hopefully the solution will give us a similar or better result than Tony has shown!

PRO
USA
#48  

Brainstorming post.

Cortana provides via cloud services:

  1. Voice Commands e.g. Write an email to John saying Happy birthday.

  2. Voice dictation e.g. Taking notes, writing emails (Speech To Text)

  3. Natural Language Processing e.g. What's 17 minus two?

  4. Custom Actions can be useful to create interactions like: "Hey Cortana, switch off the lights", "Hey Cortana, go to the kitchen", "Hey Cortana, move the camera to the right".

  5. Voice dictation so far only makes sense if you have bot components like Pandora. What do you do with a long text?

  6. NLP is handled by Microsoft, there is no way to feed specific content.

I see huge potential in integrating Cortana's voice recognition with AI bots, e.g. Pandora Bots. Besides custom actions, what other kinds of integration can be done with ARC?

#49  

This might be a different way to work with Cortana:

Germany
#50  

Hello Tony, I have been trying to emulate you for 2.5 months. Unfortunately I have not been successful; the hints in this thread are not enough and I cannot get any further. Will you ever share your speech recognition revolution with others?

Thank you for your response.

Sven