Bing Speech Recognition

Accurate Bing cloud speech-to-text for ARC: wake-word, programmable control, $BingSpeech output, Windows language support, headset compatible

How to add the Bing Speech Recognition robot skill

  1. Load the most recent release of ARC (Get ARC).
  2. Press the Project tab from the top menu bar in ARC.
  3. Press Add Robot Skill from the button ribbon bar in ARC.
  4. Choose the Audio category tab.
  5. Press the Bing Speech Recognition icon to add the robot skill to your project.

Don't have a robot yet?

Follow the Getting Started Guide to build a robot and use the Bing Speech Recognition robot skill.


How to use the Bing Speech Recognition robot skill

The Bing Speech Recognition robot skill for ARC uses Microsoft’s cloud-based Bing Speech Recognition service, which is one of the most accurate speech-to-text engines available. This skill converts spoken audio into text that can be used to control your robot, trigger scripts, or enable conversational AI.

Two Versions of This Skill

There are two versions of this robot skill: this one and Advanced Speech Recognition. This version uses the cloud services included with the ARC Pro subscription. The daily and monthly usage limits are generous, but if you exceed the ARC Pro query count, you can switch to the Advanced Speech Recognition skill instead.

Microphone Recommendation

Robots typically generate significant background noise from motors, servos, fans, and speakers. For this reason, mounting a microphone directly on the robot is often not ideal. The recommended approach is to place the microphone on the controlling PC or laptop, on yourself, or somewhere in the room away from the robot.

Increasing microphone gain allows speech to be detected across larger rooms, but higher gain also increases the chance of false positives. Experiment with microphone placement and gain levels to find the best balance for your environment. For best results, use a headset or Bluetooth microphone rather than a built-in laptop microphone.

Main Window

1. Start Recording Button
Starts Bing Speech Recognition. The skill waits for detected speech, captures your voice, converts it to text, and displays the result in the Response Display.

2. Audio Waveform
Provides visual feedback that your microphone is correctly configured and actively receiving audio input.

3. Response Display
Displays recognized speech as text, silence detection, and diagnostic messages. This log helps you fine-tune wake word detection, microphone levels, and recognition confidence.

Configuration

Phrase List
A list of predefined phrases that the recognizer will attempt to match exactly. These phrases can be customized and expanded.

Not Handled Script
Executed when speech is detected that does not match any phrase in the Phrase List. This script is skipped if a phrase match occurs.

All Recognized Script
Executed for every detected phrase, regardless of whether it matches a phrase list entry. The recognized text is available in the $BingSpeech variable.
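
The relationship between the Phrase List, the Not Handled Script, and the All Recognized Script can be sketched in plain JavaScript. This is only a model of the behavior described above, not the skill's implementation. The function name routeSpeech, the case-insensitive comparison, and the order in which the scripts run are all assumptions made for illustration.

```javascript
// Model of the documented routing. All names here are hypothetical.
function routeSpeech(text, phraseScripts, notHandled, allRecognized) {
  // Assumption: matching ignores case and surrounding whitespace.
  var key = text.trim().toLowerCase();
  var ran = [];

  if (Object.prototype.hasOwnProperty.call(phraseScripts, key)) {
    phraseScripts[key](text); // a Phrase List entry matched
    ran.push("phrase");
  } else {
    notHandled(text);         // no Phrase List entry matched
    ran.push("notHandled");
  }

  allRecognized(text);        // runs for every detected phrase
  ran.push("allRecognized");
  return ran;
}

// Example wiring with placeholder scripts that only record what happened.
var log = [];
var phrases = {
  "robot dance": function () { log.push("dancing"); }
};
function onNotHandled(t) { log.push("unhandled: " + t); }
function onAllRecognized(t) { log.push("heard: " + t); }

var first = routeSpeech("Robot dance", phrases, onNotHandled, onAllRecognized);
var second = routeSpeech("tell me a joke", phrases, onNotHandled, onAllRecognized);
```

In this model a matched phrase skips the Not Handled Script, while the All Recognized Script runs in both cases.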

Start Listening Script
Executed whenever the skill begins listening. This is commonly used to turn on an LED, play a sound, or visually indicate that the robot is listening.

Variable Field
Specifies the variable that stores the recognized text. The default variable is $BingSpeech, which is global and accessible from JavaScript or Python using GetVar().
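
As a rough illustration, a script could read the variable and decide how to respond. The sketch below assumes the default variable name $BingSpeech. The helper chooseReply and its replies are hypothetical, and the typeof guards exist only so the same code also runs outside ARC, where getVar() and Audio.say() are not defined.

```javascript
// Hypothetical helper: picks a reply based on words in the recognized text.
function chooseReply(text) {
  var heard = text.toLowerCase();
  if (heard.indexOf("hello") !== -1) return "Hello there";
  if (heard.indexOf("your name") !== -1) return "I am a robot";
  return ""; // nothing to say
}

// Inside ARC, getVar() reads the global variable set by the skill.
// Outside ARC, a sample sentence is used instead.
var recognized = typeof getVar === "function" ? getVar("$BingSpeech") : "Hello robot";
var reply = chooseReply(recognized);

// Speak only when there is a reply and the ARC Audio object is available.
if (reply !== "" && typeof Audio !== "undefined" && typeof Audio.say === "function") {
  Audio.say(reply);
}
```

Keeping the decision logic in its own function makes it easy to test without a microphone or a cloud request.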

Auto Record Using Wake Word
Enables wake-word detection similar to home assistants like Alexa or Google Home. When the wake word is detected, the skill automatically begins listening.

Wake Word Sound
Plays a selected sound through the PC’s default audio device when the wake word is detected.

Min Wake Word Confidence
Sets the minimum confidence threshold (0.0 – 1.0) required to trigger the wake word. The default value is 0.75.
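
The threshold can be thought of as a simple comparison. The sketch below is an assumption about how such a check might look; the function name wakeWordAccepted is hypothetical, and whether the skill treats a confidence exactly equal to the threshold as a match is not documented.

```javascript
// Hypothetical check: accept the wake word only when confidence reaches the minimum.
function wakeWordAccepted(confidence, minConfidence) {
  if (minConfidence === undefined) {
    minConfidence = 0.75; // documented default
  }
  // Assumption: the comparison is inclusive.
  return confidence >= minConfidence;
}

var accepted = wakeWordAccepted(0.8);       // above the default threshold
var rejected = wakeWordAccepted(0.6);       // below the default threshold
var relaxed = wakeWordAccepted(0.6, 0.5);   // a lower threshold accepts more detections
```

Lowering the threshold makes the wake word easier to trigger but also increases the chance of false positives.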

Play Wake Word Sound with ControlCommand()
Plays the wake sound when listening is started programmatically using ControlCommand().

Stop Punctuation
Removes punctuation from recognized speech to simplify text parsing.
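
To see why punctuation-free text is easier to parse, consider the following sketch. The helper normalizeSpeech is hypothetical and not part of ARC; it strips a fixed set of common punctuation marks so that two differently punctuated transcriptions of the same command compare equal.

```javascript
// Hypothetical helper: reduce text to lowercase words separated by single spaces.
function normalizeSpeech(text) {
  return text
    .toLowerCase()
    .replace(/[.,!?;:"()]/g, "") // remove common punctuation marks
    .replace(/\s+/g, " ")        // collapse runs of whitespace
    .trim();
}

// Both transcriptions reduce to "move forward".
var sameCommand = normalizeSpeech("Move forward.") === normalizeSpeech("move  forward");
```

With the option enabled in the skill, a script can compare the recognized text directly without this extra step.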

Setup Microphone
Opens the Windows audio configuration dialog to select and adjust the microphone input device.

Max Recording Length (Seconds)
Limits how long the skill listens before stopping automatically, helping prevent false positives.

Language Drop-down
Selects the speech recognition language. Supported languages depend on the Windows Speech Recognition configuration installed on your system.

How to Use Bing Speech Recognition

A detailed tutorial demonstrates using this skill with PandoraBot AI for conversational robots. View the tutorial here.

Using Bing Speech for Conversational Input

To begin speech recognition, press the Start Recording button or trigger listening with a wake word or ControlCommand(). Recognized text is stored in the $BingSpeech variable.

Automatic speech detection using voice activity is unreliable in noisy environments. The recommended approach is a push-to-talk system, where listening is explicitly started and stopped via software or a physical button.

The example below uses the Script skill and a button connected to EZ-B port D0 (configured with a pull-up resistor).


while (true) {

  // Wait for the button press (pulls D0 low)
  Digital.wait(d0, false);

  // Start Bing Speech Recognition
  ControlCommand("Bing Speech Recognition", "StartListening");

  // Wait for the button release (D0 returns high)
  Digital.wait(d0, true);

  // Stop listening and begin transcription
  ControlCommand("Bing Speech Recognition", "StopListening");
}
  

Videos

Example usage combined with Cognitive Vision and Cognitive Emotion.

Requirements

Internet connection required. A second WiFi adapter or Ethernet connection may be needed. Learn more about dual network connections here.

Recommended Hardware

Headset or External Microphone

A headset or external microphone significantly reduces background noise and prevents the recognizer from hearing the robot’s own voice. This improves accuracy and reduces false positives.

Resources

Configure Audio Input Device

  1. Right-click the speaker icon in the Windows system tray
  2. Select Open Sound Settings
  3. Verify the correct microphone is selected and responding to audio
  4. Adjust microphone volume so normal speech peaks near the middle of the VU meter

Control Commands for the Bing Speech Recognition robot skill

Control Commands allow this robot skill to be controlled programmatically from scripts or other robot skills. Use them to automate actions, respond to sensor inputs, and integrate the robot skill with other systems or custom interfaces. If you're new to Control Commands, the Control Command Manual explains how to use them and provides examples to get you started.

Control Command Manual

// Starts listening and returns the recognized text. No scripts are executed. (Returns String)

  • controlCommand("Bing Speech Recognition", "GetText")

// Start listening and convert the speech into text. Sets the global variable and executes the script.

  • controlCommand("Bing Speech Recognition", "StartListening")

// Stop the current listening process.

  • controlCommand("Bing Speech Recognition", "StopListening")

// Pause listening so the wake word (if enabled) does not trigger listening.

  • controlCommand("Bing Speech Recognition", "PauseListening")

// Unpause listening so the wake word (if enabled) will trigger listening.

  • controlCommand("Bing Speech Recognition", "UnpauseListening")

// Return the status of the pause checkbox. (Returns Boolean [true or false])

  • controlCommand("Bing Speech Recognition", "GetPause")
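
One possible use of PauseListening and UnpauseListening is to keep the wake word from triggering while the robot itself is speaking. The sketch below is a minimal illustration under stated assumptions: the function speakWithoutSelfTrigger is hypothetical, and the control-command and speech functions are passed in as parameters so the example can run outside ARC with stand-ins. It also assumes the speech function blocks until speech has finished.

```javascript
// Hypothetical wrapper: pause wake-word listening, speak, then unpause.
function speakWithoutSelfTrigger(text, sendCommand, sayAndWait) {
  sendCommand("Bing Speech Recognition", "PauseListening");
  try {
    sayAndWait(text); // assumed to block until the speech has finished
  } finally {
    // Always unpause, even if speaking throws an error.
    sendCommand("Bing Speech Recognition", "UnpauseListening");
  }
}

// Stand-in functions so the sketch runs outside ARC; they only record calls.
var calls = [];
function fakeControlCommand(skill, command) { calls.push(skill + ":" + command); }
function fakeSay(text) { calls.push("say:" + text); }

speakWithoutSelfTrigger("Hello", fakeControlCommand, fakeSay);
```

Inside ARC, the stand-ins would be replaced by small functions that call controlCommand() and a speech function that waits for the audio to finish; which speech function behaves that way should be confirmed in the ARC scripting reference.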

USA
#2   — Edited

Good morning. In the PandoraBot control, I put: Audio.say(getVar("$BingSpeech"));

I have it speak out of the PC.

In the Bing Speech skill, I put: ControlCommand("PandoraBot", SetPhrase, $BingSpeech)

I use AimlBot so I can write my own responses.

For AimlBot it is about the same:

Bing Speech: ControlCommand("AimlBot", SetPhrase, $BingSpeech)

AimlBot: Audio.say(getVar("$BotResponse"));

Thanks, EzAng

#3  

Thank you very much for your prompt reply. It seems that I had to get an ARC update. It works now, but the update has a different "conf" screen. It looks like the original voice recognition skill.

The new question is: if I set my Bing Speech to send messages to AimlBot, how can I use Bing to do other things, like send messages to Cognitive Vision or servos? It now responds both from the AimlBot and from the "phrase" and "command".

Synthiam
#4  

You’ll find labels and question marks next to options. This applies to all options across the entire ARC software.

#5   — Edited

Thanks DJ.  That helped.

Have you ever seen this intermittent error:

Error in response received: There was an error during asynchronous processing. Unique state object is required for multiple asynchronous simultaneous operations to be outstanding. Error in response received: The underlying connection was closed: An unexpected error occurred on a receive.

USA
#6   — Edited

What is the maximum number of monthly queries for this service? And what tells the user that he/she has used it all up? Asking for a friend :/

Synthiam
#7  

It's a quota divided among users, so I think it's something like 100 per day. The Advanced Speech Recognition skill is what you'd want to use if you need more.

You'll know because you get a message that says the quota has been used up.

USA
#8   — Edited

Today I get a message saying it can't reach the server. It's been working flawlessly for weeks, and now I get a long hang followed by a message that it can't reach the server. I have double-checked that I am connected to the internet. Is there any other reason to get this message?

Can anyone else check to see if it's working? Maybe it's maintenance on the server?

Edit: now working...go figure!