Speech Recognition

Uses Windows Speech Recognition to detect custom phrases through the PC microphone and trigger configurable scripts or actions, with an adjustable confidence threshold.

How to add the Speech Recognition robot skill

  1. Load the most recent release of ARC (Get ARC).
  2. Press the Project tab from the top menu bar in ARC.
  3. Press Add Robot Skill from the button ribbon bar in ARC.
  4. Choose the Audio category tab.
  5. Press the Speech Recognition icon to add the robot skill to your project.

Don't have a robot yet?

Follow the Getting Started Guide to build a robot and use the Speech Recognition robot skill.


How to use the Speech Recognition robot skill

The Speech Recognition robot skill uses the built-in Microsoft Windows Speech Recognition Engine to listen for known phrases using your computer’s default audio input device (microphone). Phrases are configured manually in the Settings, and each phrase can trigger a custom action using scripts and commands.

Main Window

Speech Recognition - Main Window

1. Pause Button
Pauses audio detection. While paused, the skill will not recognize phrases and no configured actions will execute.

2. Phrase List Button
Opens the phrase list configured in the Settings. This allows you to quickly review what phrases the skill is currently listening for.

Speech Recognition - Phrase List

3. Audio Waveform
Displays live audio waveform feedback to confirm your microphone is configured correctly and actively receiving sound.

4. Response Display
Shows detection and execution feedback. When a phrase is recognized, the log will display the matched phrase (and confidence), along with the action/script that was executed.

Settings

Speech Recognition - Settings

1. Confidence Drop-down
Sets the minimum confidence required for a phrase to be accepted. Phrases detected below this threshold are ignored. If your phrases are not being detected reliably, reduce the confidence setting. When a phrase is detected, the confidence value appears in brackets in the Response Display.
Note: Lower confidence thresholds increase the chance of false positives.

2. Setup Microphone Button
Opens the Windows microphone device properties dialog. Use this to verify the correct input device is selected and to confirm the audio meter responds when you speak.

3. Recognition Scripts
Two scripts: one runs whenever a phrase is recognized at or above the confidence threshold (All Recognized), and the other runs when a phrase is detected below that threshold (low confidence). These are useful for logging, diagnostics, or responding differently based on recognition quality.
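
The interaction between the confidence threshold and the two recognition scripts can be pictured with a small sketch. This is not ARC's internal code or scripting API. The function name `route_recognition`, the handler parameters, and the sample values are all hypothetical, and the sketch assumes that confidence is a decimal between 0 and 1.

```python
from typing import Callable

# A handler receives the recognized phrase and its confidence value.
Handler = Callable[[str, float], None]


def route_recognition(
    phrase: str,
    confidence: float,
    threshold: float,
    on_recognized: Handler,
    on_low_confidence: Handler,
) -> bool:
    """Run one of two handlers depending on the confidence threshold.

    Returns True when the phrase is accepted (confidence at or above the
    threshold) and False when it is treated as a low-confidence detection.
    """
    if confidence >= threshold:
        on_recognized(phrase, confidence)
        return True
    on_low_confidence(phrase, confidence)
    return False


if __name__ == "__main__":
    log = []
    accept = lambda p, c: log.append(f"accepted: {p} ({c:.2f})")
    reject = lambda p, c: log.append(f"low confidence: {p} ({c:.2f})")

    # Hypothetical detections against a 0.70 threshold.
    route_recognition("robot forward", 0.82, 0.70, accept, reject)
    route_recognition("robot forward", 0.41, 0.70, accept, reject)
    print("\n".join(log))
```

Lowering `threshold` moves more detections into the accepted branch, which is why a lower setting recognizes phrases more readily but also lets more false positives through.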

4. Enable / Disable Phrase Fields
Phrases used to pause and unpause speech detection. These act as voice-controlled toggles for enabling or disabling recognition.

5. Enable / Disable Command Scripts
The scripts executed when the Enable/Disable phrases are recognized. This is commonly used to control robot behavior such as entering a “listening mode” or disabling speech during loud activities.
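
As a rough illustration of how voice-controlled pause and unpause toggles behave, the sketch below models them as a tiny state machine. It is an assumption-laden model, not ARC's implementation: the class `PhraseListener`, the example phrases, and the returned script labels are hypothetical, and it assumes the enable phrase is still heard while recognition is paused (otherwise it could not unpause by voice).

```python
from typing import Dict, Optional


class PhraseListener:
    """Toy model of a listener with enable/disable voice toggles."""

    def __init__(self, enable_phrase: str, disable_phrase: str,
                 commands: Dict[str, str]) -> None:
        self.enable_phrase = enable_phrase.strip().lower()
        self.disable_phrase = disable_phrase.strip().lower()
        # Map each normal phrase to the (hypothetical) script it triggers.
        self.commands = {p.strip().lower(): c for p, c in commands.items()}
        self.paused = False

    def hear(self, phrase: str) -> Optional[str]:
        """Return the script label to run for this phrase, or None."""
        text = phrase.strip().lower()
        if text == self.disable_phrase:
            self.paused = True
            return "disable script"
        if text == self.enable_phrase:
            self.paused = False
            return "enable script"
        if self.paused:
            # While paused, ordinary phrases are ignored.
            return None
        return self.commands.get(text)


if __name__ == "__main__":
    listener = PhraseListener(
        enable_phrase="robot listen",      # hypothetical phrase
        disable_phrase="robot sleep",      # hypothetical phrase
        commands={"robot forward": "forward script"},
    )
    for spoken in ["robot forward", "robot sleep", "robot forward",
                   "robot listen", "robot forward"]:
        print(spoken, "->", listener.hear(spoken))
```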

6. Language Drop-down
ARC uses the speech recognition capabilities built into Windows. Any language supported by Windows Speech Recognition is also supported by ARC. ARC will default to EN-US if installed, otherwise it will use the first installed language that supports speech recognition. If multiple supported languages are installed, select the desired language here.
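
The default-language rule described above can be expressed as a short sketch. The function name `pick_default_language` and the example language tags are hypothetical. How ARC actually enumerates installed recognizers is not shown here, so the list of installed languages is simply passed in.

```python
from typing import List, Optional


def pick_default_language(installed: List[str]) -> Optional[str]:
    """Prefer EN-US when present, else the first installed language.

    `installed` is assumed to contain only languages that support
    speech recognition, in the order Windows reports them.
    """
    for tag in installed:
        if tag.lower() == "en-us":
            return tag
    # No EN-US recognizer: fall back to the first one, if any exists.
    return installed[0] if installed else None


if __name__ == "__main__":
    print(pick_default_language(["fr-FR", "en-US"]))
    print(pick_default_language(["de-DE", "fr-FR"]))
    print(pick_default_language([]))
```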

For more information on installing additional speech recognition languages, view: https://www.tenforums.com/tutorials/120631-change-speech-recognition-language-windows-10-a.html

Here is how to add a new language pack:

  1. Go to Start and open Settings.
  2. Select Time & language > Language.
  3. Select the language you want to add speech to, then select the Next button.
  4. Select the speech options you want included with the language.
  5. Sign out and sign back in for the new speech pack to be added to speech options.
  6. Go back to Settings > Time & language > Language, select your new language, and move it to the top to make it default.
  7. Go to Speech and ensure the Speech language setting matches your new default language.
  8. Sign out and sign back in for the new settings to take effect.
  9. Select the desired language from the ARC Speech Recognition configuration menu.

7. Confidence Variable
The variable that stores the confidence value (decimal format) of the last recognized phrase.

8. Phrase Variable
The variable that stores the text of the last recognized phrase.

9. Phrase List
The list of phrases to recognize. You can customize the defaults and add additional phrases.

10. Command List
The command/action corresponding to each phrase in the same row. You can customize commands and add additional rows.

11. List Management Buttons
Buttons for managing phrase rows: move up/down, insert, append, and delete.
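
One way to think about the Phrase List, Command List, and list management buttons is as an ordered list of rows, where each row pairs a phrase with its command. The sketch below is only a mental model under that assumption. The helper names (`build_lookup`, `move_up`), the sample phrases, and the command strings are hypothetical placeholders, not ARC script commands.

```python
from typing import Dict, List, Tuple

# Each row pairs a phrase with the command on the same row.
Row = Tuple[str, str]


def build_lookup(rows: List[Row]) -> Dict[str, str]:
    """Build a case-insensitive phrase-to-command lookup from the rows.

    If a phrase appears twice, the earlier row is kept (an assumption
    made for this sketch only).
    """
    lookup: Dict[str, str] = {}
    for phrase, command in rows:
        lookup.setdefault(phrase.strip().lower(), command)
    return lookup


def move_up(rows: List[Row], index: int) -> None:
    """Swap the row at `index` with the row above it, if there is one."""
    if 0 < index < len(rows):
        rows[index - 1], rows[index] = rows[index], rows[index - 1]


if __name__ == "__main__":
    rows: List[Row] = [
        ("robot forward", "go forward"),   # placeholder command text
        ("robot stop", "stop moving"),
    ]
    rows.append(("robot wave", "wave arm"))     # append a row
    rows.insert(1, ("robot dance", "dance"))    # insert a row
    move_up(rows, 3)                            # move "robot wave" up
    del rows[1]                                 # delete "robot dance"
    print(rows)
    print(build_lookup(rows).get("robot stop"))
```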

How to Use Speech Recognition

  1. Install, configure, and test your audio input device (see Resources below).
  2. Add the Speech Recognition skill to your ARC project: Project > Add Robot Skill > Audio > Speech Recognition.
    Note: This is different from Advanced Speech Recognition.
  3. In Settings, configure your phrases and the corresponding commands/scripts.
  4. Save your settings, then speak your configured phrases into the microphone to trigger the actions.

Requirements

Headset or External Microphone

A headset or external microphone produces better results than a built-in PC/laptop microphone. It helps the recognition engine capture your voice clearly with less background noise. Laptop fans, motors, radio interference, and room echo can cause false positives (the skill recognizes an incorrect phrase). An external mic also helps prevent the speech engine from hearing the robot’s own speaker output.

Resources

Configure Microphone Input Device

You may need to adjust your microphone input volume/gain. Use the Windows Volume Mixer and ensure you have selected the correct input device. Some systems have multiple microphones (for example: webcam microphone, headset microphone, Bluetooth microphone). Follow these steps:

  1. Right-click the speaker icon in the system tray.
  2. Select Open Sound Settings.
  3. In the Input section, confirm the correct microphone is selected and that the VU meter moves when you speak.
  4. Click Device Properties and adjust the volume slider. We often use a value around 78, but your setup may differ.
  5. Adjust volume so normal speech peaks near the middle of the VU meter. If the gain is too high, audio distorts and recognition quality drops.
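
The gain advice in the steps above can be illustrated with a short sketch that checks the peak level of a block of audio samples. It assumes samples are floats normalized to the range -1.0 to 1.0, and the `low` and `high` cut-offs are arbitrary values chosen for illustration, not values taken from Windows or ARC. The function names are hypothetical.

```python
import math
from typing import Iterable


def peak_level(samples: Iterable[float]) -> float:
    """Return the largest absolute sample value (0.0 for no samples)."""
    return max((abs(s) for s in samples), default=0.0)


def describe_gain(samples: Iterable[float],
                  low: float = 0.25, high: float = 0.9) -> str:
    """Classify input gain from the peak of normalized samples.

    `low` and `high` are illustrative thresholds only.
    """
    peak = peak_level(samples)
    if peak >= high:
        return "too high"   # close to full scale, likely to distort
    if peak < low:
        return "too low"    # speech barely registers
    return "ok"             # peaks sit around the middle of the range


if __name__ == "__main__":
    # Synthetic test tone: one cycle of a sine wave at half amplitude.
    tone = [0.5 * math.sin(2 * math.pi * i / 48) for i in range(48)]
    print(describe_gain(tone))
    print(describe_gain([s * 2 for s in tone]))
    print(describe_gain([s * 0.1 for s in tone]))
```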

Voice Training

You can train Windows Speech Recognition using the built-in training wizard. Open the Windows Control Panel, search for Speech Recognition, and run the training wizard to improve accuracy for your voice.

Troubleshooting

  • If you receive an error that the input device could not be opened, Windows privacy/security settings may be blocking microphone access for Synthiam ARC. Follow this guide to enable microphone access: https://synthiam.com/Support/troubleshooting/camera-audio-microphone-issues
  • If you receive an error stating Voice Recognition was unable to start (invalid OS or missing device), verify:
    1. A microphone is set as the default recording device in Windows sound settings.
    2. A Windows language pack is installed that supports Windows Speech Recognition.

To confirm Windows Speech Recognition is working, open the built-in Windows Speech Recognition application: click the Start button and type Speech Recognition. Launch Windows Speech Recognition to verify your microphone, operating system, and language support. This Microsoft tool includes diagnostic dialogs that ARC does not provide.

Windows Speech Recognition App

PRO
USA
#1   — Edited

All is good now. It always did work; it was just Windows doing its thing. Oh well.

PRO
Synthiam
#2  

Speech recognition uses the built-in Microsoft speech recognition engine. There is nothing Synthiam can do to enhance or change the engine, as it is not open source and is owned by Microsoft. Follow the instructions above.

PRO
USA
#3  

All is good now

Just did the old Windows reboot, and now it works.

Australia
#4   — Edited

Hi there,

Sometimes when I open ARC, my speech recognition box doesn't detect the soundwaves. It shows the soundwave box as grey and not the red and green lines, even when I have connected the program with my microphone.

The person I am working with came to the conclusion that it might be because I didn't start with the JD Bare file when I opened the project. He has been starting with the JD Bare project every time we open the program and merging it with the previously saved program, rather than opening the saved program directly because of this issue with the speech recognition. But I thought it was inconvenient to do that and that surely I should be able to open previously saved projects and have them work.

I have run out of ideas. Any suggestions on how to fix this are much appreciated.

PRO
Synthiam
#5  

Can you verify whether the speech recognition skill you're using is Bing Speech Recognition or just Speech Recognition?

Australia
#6  

General speech recognition, not Bing.

PRO
Synthiam
#7  

Okay, I’ll see if I can reproduce it. Stay tuned.

Australia
#8  

Thanks, DJ :)

This is what the Speech Recognition window sometimes looks like when I open my previously saved programs. Notice how the soundwaves cannot be seen at the top right? I'm baffled!

User-inserted image