Advanced Speech Recognition

Name: Advanced Speech Recognition
Author: DJ Sures

by Microsoft

Advanced Azure-backed speech-to-text for ARC that supports custom Azure Cognitive Services keys, scripting hooks, and configurable output variables.

Requires ARC v42 (Updated 1/31/2025)

Compatible with: Microsoft Windows 10 or 11


How to add the Advanced Speech Recognition robot skill

Load the most recent release of ARC (Get ARC).
Press the Project tab from the top menu bar in ARC.
Press Add Robot Skill from the button ribbon bar in ARC.
Choose the Audio category tab.
Press the Advanced Speech Recognition icon to add the robot skill to your project.

Don't have a robot yet?

Follow the Getting Started Guide to build a robot and use the Advanced Speech Recognition robot skill.

How to use the Advanced Speech Recognition robot skill

This is an advanced alternative to the Bing Speech Recognition robot skill for ARC. It lets you supply your own Azure Cognitive Services credentials for the Bing Speech Recognition cloud service, which is by far the most accurate speech recognition service we've ever used. This robot skill is meant for advanced users who want to write scripts to customize its behavior. For most use cases, the Bing Speech Recognition robot skill is recommended because it has several easy-to-use features.

Main Window

1. Start Recording: Once you have entered the API key in the advanced configuration, press this button to begin converting speech to text.

2. Audio Waveform: Gives visual feedback that your audio input device (microphone) is configured correctly and is picking up voice and sounds.

3. Response Display and Log: Shows feedback from the robot skill, including the text of your detected words and any log messages for debugging.

Configuration

1. Language: The language that will be used for recognition.

2. Variables: The variable that holds the text from the speech recognizer. Your script can read it to determine what was spoken.

3. Script: This script executes for every detected phrase. It is not called when recognition produces no match.

4. Settings: Options for customizing the recognition result. The Strip Punctuation checkbox removes commas, periods, question marks, etc., from the result. The Make All Responses Lowercase checkbox converts the result to lowercase. Both make string comparison in your code easier.
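To see why these two settings simplify scripts, the sketch below approximates their effect in JavaScript. The function name normalizePhrase and the exact set of punctuation removed are assumptions for illustration; the skill's own rules may differ.

```javascript
// Hypothetical helper: approximates the Strip Punctuation and
// Make All Responses Lowercase settings. Not the skill's actual code.
function normalizePhrase(text, stripPunctuation, makeLowercase) {
  var result = String(text);
  if (stripPunctuation) {
    // Remove common sentence punctuation (assumed set).
    result = result.replace(/[.,!?;:]/g, "");
  }
  if (makeLowercase) {
    result = result.toLowerCase();
  }
  return result.trim();
}

// With both options on, a recognized sentence compares cleanly to a plain string.
var raw = "Robot, move forward.";
var normalized = normalizePhrase(raw, true, true);
var matches = (normalized === "robot move forward");
```

Without the two options, the script would have to account for every capitalization and punctuation variant the recognizer might return.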

How to Use Advanced Speech Recognition Skill

You will need a Microsoft Azure account and an API key to use this robot skill. Creating an Azure key takes many steps and is not easy for the average person, which is why we made a free version available here. However, you can create an Azure account to unlock this skill's full functionality and manage your own paid subscription. With the Azure account, you create a Cognitive Speech Recognition service, then locate its Region and API key and enter them in the Advanced tab of this skill. You can create an Azure account by clicking here.

Add a Cognitive Services resource to your Azure portal account.
Click Add and search for the "Speech" service in the Marketplace.
Create the Speech service, then enter a name (ex: Test), a subscription (ex: Pay-as-You-Go), a location (ex: West US), a pricing tier (ex: Free 0), and create a new resource group (ex: BingSpeechTest).
Once your deployment is complete, click the "Go to resource" button.
In the left menu, select "Keys and Endpoint".
Click the "Show Keys" button and copy the generated key in the KEY 1 field.
If you don't remember the location you selected above, you can find it in the Location field.
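Before pasting the key into ARC, it can help to confirm that the key and region belong together. The sketch below builds a request for the Azure Speech token endpoint without sending it. It is independent of ARC and is not how the skill itself connects; the function name buildTokenRequest, the placeholder key, and the lowercase region identifier westus (for the West US location) are assumptions for illustration.

```javascript
// Build (but do not send) a token request for an Azure Speech resource.
// region: lowercase region identifier, e.g. "westus" (assumed form).
// key: the value copied from the KEY 1 field.
function buildTokenRequest(region, key) {
  return {
    method: "POST",
    url: "https://" + region + ".api.cognitive.microsoft.com/sts/v1.0/issueToken",
    headers: {
      // Header that carries the subscription key for Azure Cognitive Services.
      "Ocp-Apim-Subscription-Key": key
    }
  };
}

// Placeholder values only; replace them with your own region and key.
var request = buildTokenRequest("westus", "YOUR_KEY_1");
```

Sending this request with an HTTP client of your choice should return an access token when the key matches the region, and an authorization error when it does not.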

Add the Advanced Speech Recognition skill to your ARC project (Project -> Add Robot Skill -> Audio -> Advanced Speech Recognition).
In the Advanced tab of the Advanced Speech Recognition Configuration menu, enter the region and paste the API key.

Save your configuration and start speaking to activate the speech recognition.

Note: When not in use, ensure that the Advanced Speech Recognition skill is PAUSED, because your Azure account is charged whenever the skill is used.

Script Samples

Add this code to the Script setting in this skill's configuration to send detected speech to the PandoraBot control.

ControlCommand("PandoraBot", SetPhrase, $BingSpeech)
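When the recognized text should trigger robot actions instead of being forwarded to PandoraBot, the skill's script can look the phrase up in a table. This is a minimal sketch in JavaScript, assuming the variable is named $BingSpeech as in the sample above and that ARC's getVar() and print() script functions are available. The phrases, the action labels, and the function name lookupAction are hypothetical.

```javascript
// Hypothetical phrase table: the phrases and action labels are examples only.
var phraseActions = {
  "move forward": "forward",
  "stop": "stop",
  "turn left": "left"
};

// Return the action label for a recognized phrase, or null when nothing matches.
function lookupAction(spokenText) {
  var key = String(spokenText).trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(phraseActions, key)
    ? phraseActions[key]
    : null;
}

// Inside ARC, getVar() reads a global variable. Outside ARC this falls back
// to a fixed sample so the sketch can run on its own.
var spoken = (typeof getVar === "function") ? getVar("$BingSpeech") : "Turn Left";
var action = lookupAction(spoken);

// print() writes to the ARC script console; console.log is used elsewhere.
var out = (typeof print === "function") ? print : console.log;
out(action === null ? "No action for: " + spoken : "Action: " + action);
```

Keeping the phrases in one table means a new voice command is a one-line change. Replace the final line with whatever your project should do for each action label.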

Video

Here's an example of the skill in action combined with the Cognitive Vision and Cognitive Emotion services.

Requirements

The Advanced Speech Recognition skill requires paying for an API key from Microsoft. We provide another version that does not require an API key, and you can install it from here.

This service requires an internet connection. If the computer's WiFi is already used to connect to the robot, a second USB WiFi adapter or an Ethernet connection will be needed. Read about having two network connections here.

Resources

When this control detects a phrase, the phrase is assigned to the variable and the specified script executes. This lets the skill send the detected speech to a PandoraBot skill using ControlCommand(). Here's a sample project: bingsearchtest.EZB

Control Commands for the Advanced Speech Recognition robot skill

Control Commands are available for this robot skill, which allow the skill to be controlled programmatically from scripts or other robot skills. These commands enable you to automate actions, respond to sensor inputs, and integrate the robot skill with other systems or custom interfaces. If you're new to the concept of Control Commands, we have a comprehensive manual available here that explains how to use them and provides examples to help you get started and make the most of this powerful feature.

Control Command Manual

controlCommand("Advanced Speech Recognition", "StartListening") -- Begins listening and converts speech to text. When recognition completes, the variable is set with the result and the script executes.
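As a rough illustration, a script elsewhere in the project could start a listening session like this. The sketch assumes the skill window keeps its default title; the wrapper sendCommand and the sentCommands list are hypothetical and exist only so the code can also run outside ARC, where controlCommand() is not defined.

```javascript
// Title of the skill window in the project (assumed to be the default title).
var SKILL_TITLE = "Advanced Speech Recognition";

// Outside ARC there is no controlCommand(), so calls are recorded here instead.
var sentCommands = [];

// Send a Control Command to a robot skill by its title.
function sendCommand(skillTitle, command) {
  if (typeof controlCommand === "function") {
    controlCommand(skillTitle, command); // provided by ARC's script engine
  } else {
    sentCommands.push(skillTitle + " -> " + command);
  }
}

// Start one listening session. Per the description above, the skill sets its
// variable and runs its own script once recognition completes.
sendCommand(SKILL_TITLE, "StartListening");
```

Because the result arrives through the skill's variable and script, handle the recognized text in the skill's Script setting and avoid reading the variable immediately after this call.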

joinny

Japan

#1 Jun 2019

How can I add a language pack for any language in the world to ez_builder?

Jlucben

France

#2 Nov 2019

Hello

Language selection is no longer working. I just tried to switch to fr-FR, and recognition is still in English.

EzAng

PRO

USA

#3 Nov 2019

Is this a new version? 11/25/2019?

Jlucben

France

#4 Nov 2019

ARC Designer Beta Version 2019.11.22.00

And plugin Version 24

DJ Sures

PRO

Synthiam

#5 Jan 2020

Updated (version 30) to fix errors that popped up when unpausing and pausing the listener

DJ Sures

PRO

Synthiam

#6 Mar 2020

Updated the plugin to use Microsoft's latest library (1.10.0)

DJ Sures

PRO

Synthiam

#7 Feb 2021

Updated: Uses the internal ARC VAD (voice activity detection) rather than continuous speech detection. This means API usage will be much lower because only spoken (detected) phrases are sent to the API rather than a continuous recording.

TMesserschmidt

USA

#8 Nov 2022

Is this any more accurate than Bing?

Similar Skills

Azure Speech To Text Engine

Audiotoolbox Plugin

Advanced Speech Synthesis
