
Bing Speech Recognition

How To Add This Control To Your Project
  1. Load the most recent release of ARC.
  2. Press the Project tab from the top menu bar in ARC.
  3. Press Add Control from the button ribbon bar in ARC.
  4. Choose the Audio category tab.
  5. Press the Bing Speech Recognition icon to add the control to your project.

This is a speech recognition skill for ARC that uses the Bing Speech Recognition cloud service. It is one of the most accurate speech recognition services available.

Two Versions Of This Skill

There are two versions of this robot skill: this one and the Advanced Speech Recognition. This version uses a Microsoft license key that is shared among ARC users, which lets them experiment with and demo the skill. Because the key is shared, users may encounter errors if more than one ARC instance is using this skill. For serious testing and development, we recommend setting up your own key with Microsoft by using the Advanced Speech Recognition instead.

Microphone Recommendation

Most robots make a lot of noise, so locating the audio input device on a robot is not a practical solution. It is best to locate the microphone on the controlling PC/Laptop, on yourself, or somewhere in the room (away from the robot). Turning the gain higher on the input device will allow voices to be recognized from across large rooms but will also increase false positives. Test with different gains for the best resolution. Experiment with different microphone locations and volumes for the best setup of your environment. Ideally, use a headset or Bluetooth mic rather than your laptop microphone.



Main Window




1. Start Recording Button
This button starts the Bing Speech Recognition. It will detect silence until you speak, then detect the words you are saying and display them in the Response Display.

2. Audio Waveform
This gives visual feedback that your audio input device (microphone) is configured correctly and is picking up voice/sounds.

3. Response Display
Here you will get speech recognition feedback. It will show the text version of your detected words or silence.

Configuration




1. Phrase List
 This is a list of default phrases with the ability to customize and add more phrases.

2. Command List
 This is a list of default commands, corresponding to the phrases in the same row, with the ability to customize and add more commands.

3. All Recognized Script
 This script will execute for every recognition result. If nothing was recognized, the script still executes, but the variable will be empty. An empty variable therefore tells you that no speech was spoken.

4. Variable Field
 This variable holds the text from the speech recognizer. This may be used in your script for determining what was spoken. If the variable is empty, it is because there was no speech recognized (i.e. silence).

5. Auto Record Checkbox
 If this checkbox is enabled, the skill will begin recording automatically when it detects audio above the threshold. The threshold is adjusted in the Level Threshold setting.

6. Level Threshold Field
 This adjustable threshold level is used for both Auto Recording (if enabled) and stopping recording. If you find that, even in manual mode, the recording stops too quickly before your sentence is complete, lower this value.

7. Silence Count Field
 This count sets how many times the audio energy level must fall below the level threshold before recording is stopped.

8. Max Recording Field
 This represents the maximum duration of the recording. *Note: Currently locked to 10 seconds.

9. Strip Punctuation Checkbox
 If this checkbox is enabled, punctuation will be stripped from the detected recording when displaying it on the Response Display. This is helpful when you're looking to detect certain words or phrases.

10. Setup Microphone Button
 This button is a shortcut to the properties of your installed audio input devices. Verify that your device is working by watching the soundbar for movement when you speak into that device.

11. List Management Buttons
 These buttons manage the rows of phrases. They move the rows up and down, insert them, add more to the bottom, and delete them.

12. Language Drop-down
 ARC uses the Microsoft Speech Recognition included with Windows. All languages supported by Windows Speech Recognition are also supported in ARC. You can configure Windows to listen to any language. ARC will default to EN-US (English) language if installed. Otherwise, ARC will default to the first installed language. If more than one language is installed, a language may be selected with this drop-down.

*Note 2: Languages supported by speech recognition depend on the Microsoft Windows operating system configuration. View the Microsoft speech recognition guide here to see the supported languages.

13. Not Handled Script
 This script will execute for every detected phrase that is not in the phrase list. If there is a match from the phrase list, this script will not be called.
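
The relationship between the Phrase List, the All Recognized Script, and the Not Handled Script can be sketched as a small routing function. This is only a model of the behavior described above, written in JavaScript; it is not ARC's implementation. The name routeRecognition, the table layout, the placeholder command strings, the case-insensitive exact match, and the choice to treat silence as neither handled nor not handled are all assumptions made for illustration.

```javascript
// Model of the routing described above; NOT ARC's actual implementation.
// phraseTable rows pair a phrase with the command that runs when it is heard.
function routeRecognition(text, phraseTable) {
  // The All Recognized Script runs for every result, even silence.
  var result = { allRecognized: true, command: null, notHandled: false };
  var heard = (text || "").trim().toLowerCase();
  if (heard === "") {
    return result; // silence: the variable is empty (assumption: nothing else runs)
  }
  for (var i = 0; i < phraseTable.length; i++) {
    if (phraseTable[i].phrase.toLowerCase() === heard) {
      result.command = phraseTable[i].command; // command from the matching row
      return result;
    }
  }
  result.notHandled = true; // a phrase was detected but no row matched
  return result;
}

// Hypothetical phrase table with placeholder command strings.
var table = [
  { phrase: "robot forward", command: "Forward()" },
  { phrase: "robot stop", command: "Stop()" }
];

var matched = routeRecognition("Robot Stop", table);
var unmatched = routeRecognition("tell me a joke", table);
var silence = routeRecognition("", table);
```

Under this model, a chatbot ControlCommand placed in the Not Handled Script rather than the All Recognized Script would only receive phrases that are not in the phrase list.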

How to Use Bing Speech Recognition


There is a detailed tutorial that provides an example of using this robot skill in conjunction with PandoraBot AI to have a conversation with your robot. You can find the tutorial by clicking here.



Control Commands

Control commands will send instructions from one robot skill to another. Read more about the ARC control command feature by clicking here.

ControlCommand("PandoraBot", SetPhrase, $BingSpeech)

When this skill detects a phrase, it will be assigned to the variable and the specified script will execute. With this model, you can have this skill send the detected speech to a PandoraBot skill using ControlCommand().
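
Because the variable is empty when nothing was recognized, a script that forwards speech to another skill may want to skip empty results. Below is a minimal JavaScript sketch of that guard. The names forwardSpeech, send, and recordSend are hypothetical. Inside ARC you would presumably pass ARC's own ControlCommand function and the value of getVar("$BingSpeech"); that usage is an assumption based on the scripts quoted in the comments further down, not something this page documents.

```javascript
// Hypothetical helper: forward recognized text to a chatbot skill,
// skipping empty results (silence).
// `send` stands in for ARC's ControlCommand so the sketch runs anywhere.
function forwardSpeech(text, send) {
  var phrase = (text || "").trim();
  if (phrase === "") {
    return false; // nothing was recognized, so nothing is sent
  }
  send("PandoraBot", "SetPhrase", phrase);
  return true;
}

// Stand-alone demonstration with a recording stub in place of ControlCommand.
var sent = [];
function recordSend(skill, command, value) {
  sent.push([skill, command, value]);
}
forwardSpeech("hello robot", recordSend); // forwarded
forwardSpeech("   ", recordSend);         // skipped
```

Passing the sender as a parameter keeps the guard testable outside ARC; the skill title "PandoraBot" must match the title of the skill in your project.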

ControlCommand("Bing Speech Recognition", PauseListening)
ControlCommand("Bing Speech Recognition", UnpauseListening)

Controls the state of the PAUSE checkbox, which pauses listening for VAD (voice activity detection) auto recording. If VAD is disabled, the PAUSE checkbox does not exist. The pause checkbox prevents the VAD from automatically starting a recording.

ControlCommand("Bing Speech Recognition", StartListening)

Presses the Start button, which begins listening to audio through the microphone to convert speech to text. Sending this control command has the same effect as pressing the Start button.

ControlCommand("Bing Speech Recognition", StopListening)

Presses the Stop button which stops the recording of audio that will be converted from speech to text. The Stop button is only visible when the recording is active. Recording can be active by either pressing the Start button or through auto-recording when VAD is enabled.
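
How the Level Threshold, Silence Count, and Max Recording settings from the Configuration section interact during auto recording can be modeled roughly as follows. This is a sketch of the general technique (energy-threshold voice activity detection), not the skill's actual code. The function name, the per-frame level readings, the requirement that quiet frames be consecutive, and measuring the maximum duration in frames are all assumptions.

```javascript
// Rough model of threshold-based auto recording; NOT the skill's actual code.
// levels: one audio energy reading per frame (units are arbitrary here).
// Returns {start, stop} frame indexes, or null if the threshold is never reached.
function findRecordingSpan(levels, levelThreshold, silenceCount, maxFrames) {
  var start = -1;
  var quiet = 0;
  for (var i = 0; i < levels.length; i++) {
    if (start < 0) {
      if (levels[i] >= levelThreshold) {
        start = i; // audio reached the threshold: auto record begins
      }
      continue;
    }
    // Count consecutive frames below the threshold (assumption).
    quiet = levels[i] < levelThreshold ? quiet + 1 : 0;
    if (quiet >= silenceCount || i - start + 1 >= maxFrames) {
      return { start: start, stop: i }; // silence count or max length reached
    }
  }
  return start < 0 ? null : { start: start, stop: levels.length - 1 };
}

var levels = [1, 2, 9, 8, 3, 9, 2, 1, 2, 1];
var span = findRecordingSpan(levels, 5, 3, 100);
```

With a threshold of 5 and a silence count of 3, the brief dip at index 4 does not end the recording; it ends only after three quiet frames in a row. Lowering the threshold in this model makes quieter speech count as voice, which is consistent with the advice to lower the value when recording stops too early.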



Videos




Here's an example of the skill in action combined with the Cognitive Vision and Cognitive Emotion services.

Requirements


This service requires an internet connection which means that a second USB WiFi adapter or an ethernet connection may be needed. Read about having 2 network connections here.

Headset or External Mic


Using a headset or external mic will produce dramatically better results compared to the internal PC/Laptop mic. A headset or external mic lets the recognition engine "hear" your voice much more clearly, with less background noise. The background noise of the laptop, motors, radio, and room echo will cause the recognition software to return false positives, meaning the software recognizes an incorrect phrase. An external mic will also prevent the recognition software from hearing the robot speak. In short, it is important to use a headset or external mic for a positive speech recognition experience.


Resources


Configure Audio Input Device

You might have to adjust the microphone input volume/gain. To adjust the mic volume use the Microsoft Windows volume mixer, and first, make sure you have selected the correct input device. Your laptop or computer may have a few different mic devices. Maybe one is on a remote camera. Find the mic you'd like to use and adjust the volume. To find the volume settings that are ideal on your computer, follow these steps:

1) Right-click on the little speaker icon in your system tray

2) Select "Open Sound Settings"

3) In the "Input" section of the Sound Settings, you'll notice a little VU meter beside the active device. Make sure your active device is indeed the microphone you want to use. By making sounds, the VU meter should move.

4) Click on the "Device Properties" and locate the volume slider for the microphone. We usually have our volume set for 78. Play around with different volumes until you see your voice is being picked up by the VU meter. Adjust the volume input level/gain to display your voice's regular volume near the middle of the VU Display graph. If the level/gain is too high, the recognition software will not work because the input audio will be distorted.

Voice Training
You may train your computer for speech recognition by using the training wizard. Find the training wizard under Speech Recognition within the Windows Control Panel.


PRO
USA
#2   — Edited
Good morning, in the PandoraBot control, I put - Audio.say(getVar("$BingSpeech"));

I have it speak out of the PC




in the Bing speech I put ControlCommand("PandoraBot", SetPhrase, $BingSpeech)

I use AIMLbot so I can write my own responses

for AIML bot it is about the same:

Bing speech - ControlCommand("AimlBot", SetPhrase, $BingSpeech)

AimlBot - Audio.say(getVar("$BotResponse"));


thanks 
EzAng
#3  
Thank you very much for your prompt reply. It seems that I had to get an ARC update. It works now, but the update has a different "conf" screen. It looks like the original voice recognition skill.

The new question is: if I set my Bing speech to send messages to AimlBot, how can I use Bing to do other things, like send messages to Cognitive Vision or servos? It now responds both from the AimlBot and the "phrase" and "command".
PRO
Synthiam
#4  
You'll find labels and question marks next to options. This applies to all options across the entire ARC software.
#5   — Edited
Thanks DJ.  That helped.

Have you ever seen this intermittent error: 

Error in response received: There was an error during asynchronous processing. Unique state object is required for multiple asynchronous simultaneous operations to be outstanding.
Error in response received: The underlying connection was closed: An unexpected error occurred on a receive.
PRO
USA
#6   — Edited
What is the maximum number of monthly queries for this service? And what happens to signify to the user that he/she has used it all up? Asking for a friend :/
PRO
Synthiam
#7  
It's a quota divided by users, so I think it's like 100 per day or something. The advanced bing speech recognition is what you'd want to use if you need more.

How you know is you get a message that says the quota is done.
PRO
USA
#8   — Edited
Today...I get a message saying it can't reach the server. It's been working flawlessly for weeks and now I get a long hang with a return of can't reach the server. I have double-checked that I am connected to the internet, which I am. Any other reason to get this message?

Can anyone else check to see if it's working? Maybe it's maintenance on the server?

Edit: now working...go figure!
PRO
USA
#9  
Edit 2: Now it's not again....I'm getting:

Server was unable to process request. ---> A task may only be disposed if it is in a completion state (RanToCompletion, Faulted or Canceled).
PRO
Synthiam
#10  
Maybe their server is having issues. Works fine now.

PRO
USA
#11  
Ok DJ thanks for checking. I’ll try again in the AM.
#12  
DJ,

Is there any way to directly set up Bing Speech Recognition to use a wake word?

Thomas
PRO
Synthiam
#13   — Edited
In the Speech Recognition robot skill, add a phrase with a ControlCommand() that unpauses the bing speech recognition?

And then have the bing speech recognition re-pause itself after it detects a phrase?

Use the Control Command, here's a manual on how to use the ControlCommand and access available commands for each robot skill: https://synthiam.com/Support/Programming/control-command

Essentially, you can right-click in the editor or press the Cheat Sheet tab


PS, the question had been moved from Will's unrelated thread about his youtube channel to here.
#14  
@TMesserschmidt, DJ's suggestion is exactly how I set up my robot with a wake word. I used the name "Robot". Not real creative but it works. LOL. 

I also set a set of lights to flash for the amount of time Bing was listening. That way I know that Bing is actively listening and for how long. That really helped.
#15   — Edited
Here's the process of how I added a wake word. 

Bing is normally paused and won't listen until the script in the Speech Recognition robot skill starts it or you push its Start Recording button.

Simply add the Speech Recognition robot skill to your project. Make sure it's working and listening. 

Add this EZ-Script code and modify it to work with your project. 

Code:

Set(D12, on)   # Turns on the scanner lights to let me know Bing has been called to listen
ControlCommand("Bing Speech Recognition", StartListening) # Starts Bing listening
ControlCommand("Speech Recognition", PauseMS, 2000) # Pauses the Speech Recognition control so it won't listen for a while
Sleep(1500)  # Keeps the script alive so the scanner lights show how long Bing is listening
Set(D12, off)  # Turns off the scanner lights

TIP: If you get false startups of this control, just increase the Confidence level until it hears the wake word and nothing else. Use a wake word that is not common or is not used too much around the robot. 

Nothing else should be in the Speech Recognition robot skill except this wake word. 

Works just like Alexa! 

Have fun!!
PRO
USA
#17  
I'm having an issue with VAD. When it's on I get errors; mostly it hangs sending for a long time, then returns an error: a task may only be disposed if it is in a completion state: ran to completion, fault or canceled.

If I turn off VAD I do not get the issue. But then I have to press the stop recording button, which defeats the purpose.

Can anyone repeat this issue? Just about to give up on this and move on to another skill.
PRO
Portugal
#18  
No problems here. Just used the skill with "Auto Record Checkbox".
PRO
Synthiam
#19  
Will, are you using this robot skill on 2 or more ARC instances at the same time? That might cause the issue. This free version of the recognition uses a shared key and has a limit to how many users can use it at once. There's also a limit that each user can only use it from one instance at a time. If you're using a production or dev environment, I'd recommend using the advanced speech recognition here: https://synthiam.com/Support/Skills/Audio/Advanced-Speech-Recognition?id=15894

That way, you're in charge of the key and do not need to depend on sharing keys with other users.