
Why Does Speech Recognition Pickup Background Nois…

One thing I still don't understand: whenever I have the "Speak Rec" robot skill operating, it picks up background noise and causes some of my Auto Position skills to activate on their own. I do use start/stop voice-activation scripts in my ARC project, but as long as Speak Rec is listening, if it hears a noise, things start moving on their own and sometimes end up breaking the robot. Why does this happen?

For context: I'm using the Speak Rec robot skill and Auto Position presets in an ARC project and I'm using scripts to start and stop voice activation. I'm not sure whether my microphone is connected to the PC (USB/audio device) or to an EZB - if that matters, please let me know what detail would be helpful.





Author Avatar
PRO
Canada
Twitter YouTube GitHub
#1  

This usually happens because the Speech Recognition skill is hearing ambient noise as if it were a valid phrase, and then it runs the command tied to that phrase. If one of those phrases is mapped to an Auto Position action or frame, ARC will execute it immediately.

A few important points:

Why it happens

  • Speech Recognition is always listening when enabled.
  • It does not know the difference between your voice and other sounds unless Windows speech recognition is confident enough and your phrase list is set up well.
  • If the noise is similar to a configured phrase, or if your confidence threshold is too low, it can trigger a command.
  • If that command activates an Auto Position frame/action, the robot can move unexpectedly.

What to check first

1) Confidence setting

Open the Speech Recognition skill’s kebab/hamburger menu and go to Config.
Look for the Confidence setting.

  • Try increasing the confidence threshold so only clearer matches are accepted.
  • If you already lowered it to improve recognition, that can make false triggers more likely.
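
To illustrate why the threshold matters, here is a minimal plain-JavaScript sketch of confidence gating. The `acceptRecognition` function and the result shape (`phrase`, `confidence`) are assumptions made for illustration only; they are not the Speech Recognition skill's actual internals.

```javascript
// Hypothetical sketch: accept a recognition result only if the phrase is in
// the configured list AND its confidence meets the threshold.
function acceptRecognition(result, phrases, threshold) {
  const known = phrases.includes(result.phrase.toLowerCase());
  return known && result.confidence >= threshold;
}

const phrases = ["robot move forward", "robot open gripper"];

// A noisy match scored 0.55 passes a low threshold but not a higher one.
const noisy = { phrase: "robot open gripper", confidence: 0.55 };
console.log(acceptRecognition(noisy, phrases, 0.5)); // true
console.log(acceptRecognition(noisy, phrases, 0.8)); // false
```

The same noisy input is accepted or rejected purely by where the threshold sits, which is why a lowered confidence value tends to produce more false triggers.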

2) Phrase list

Review the phrases in the skill.

  • Remove any phrases that sound too similar to common background sounds or to each other.
  • Avoid very short phrases like "go", "stop", "left", etc. if the environment is noisy.
  • Use more unique phrases like "robot move left" instead of just "left".
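
As a rough aid for reviewing a phrase list, the sketch below flags phrases that are a single word or that lack a common prefix word. The `findRiskyPhrases` helper and the choice of "robot" as the prefix are assumptions for illustration; ARC does not provide this function.

```javascript
// Hypothetical lint: flag phrases that are a single word or that do not
// start with the chosen prefix word.
function findRiskyPhrases(phrases, prefix) {
  return phrases.filter((p) => {
    const words = p.trim().toLowerCase().split(/\s+/);
    return words.length < 2 || words[0] !== prefix;
  });
}

const risky = findRiskyPhrases(
  ["left", "robot move left", "stop", "open gripper"],
  "robot"
);
console.log(risky.join(", ")); // left, stop, open gripper
```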

3) Microphone placement

This matters a lot.

  • A PC microphone often works better for voice recognition if it’s placed near you and away from the robot.
  • If the microphone is on the robot/EZ-B, it may hear the robot’s own motors, servos, speaker, or room noise.
  • Background noise and servo sound can easily cause false triggers.

If you’re using an EZ-B speaker for responses, that can also cause the microphone to hear ARC talking back and accidentally re-trigger commands.

Best practices to stop unwanted Auto Position triggers

Use a "listening mode"

Instead of leaving Speech Recognition fully active all the time, use a script to:

  • enable it only when you want to speak
  • disable it after the command is recognized

That way the robot is not always listening to ambient noise.

Make your voice commands more specific

For example:

  • "robot wave"
  • "robot sit down"
  • "robot stop moving"

instead of:

  • "wave"
  • "sit"
  • "stop"

Separate movement commands from speech commands

If possible, keep movement-related actions behind a second confirmation or a safer trigger.

For example:

  • Voice command 1: "robot prepare"
  • Voice command 2: "execute"

That can reduce accidental movement.
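
The two-step idea can be sketched as a small state machine in plain JavaScript. `createConfirmGate` and the 5-second timeout are assumptions for illustration; in ARC you would call `arm` and `confirm` from the scripts attached to the two phrases.

```javascript
// Hypothetical two-step gate: a movement runs only after it has been armed
// by a first phrase and confirmed by a second phrase within a time limit.
function createConfirmGate(timeoutMs) {
  let armedAt = null;
  return {
    // Call when the first phrase (e.g. "robot prepare") is recognized.
    arm(nowMs) {
      armedAt = nowMs;
    },
    // Call when the second phrase (e.g. "execute") is recognized.
    // Returns true only if armed recently; always disarms afterwards.
    confirm(nowMs) {
      const ok = armedAt !== null && nowMs - armedAt <= timeoutMs;
      armedAt = null;
      return ok;
    },
  };
}

const gate = createConfirmGate(5000);
console.log(gate.confirm(0));    // false: "execute" heard without "robot prepare"
gate.arm(1000);
console.log(gate.confirm(3000)); // true: confirmed within 5 seconds
console.log(gate.confirm(3500)); // false: gate was disarmed after use
```

A stray noise would have to match both phrases in order, within the time limit, before anything moves.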

Where to configure it in ARC

In the Speech Recognition skill:

  1. Open the skill’s three-line/hamburger menu
  2. Choose Config
  3. Adjust:
    • Confidence
    • phrase list
    • any start/stop scripts
  4. Save the config

Microphone detail that would help

Yes, this detail matters:

  • Is the microphone a USB mic connected to the PC?
  • Or is it coming from an EZ-B audio input / speaker setup?
  • Is ARC using Windows Speech Recognition on the PC mic, or are you using a speech skill that routes through the EZ-B?

If you tell me:

  1. which Speech Recognition skill you’re using exactly,
  2. where the mic is connected, and
  3. one or two example phrases that are triggering Auto Position,

I can help you tune it so it stops reacting to background noise.

If you'd like, I can also help you redesign the script so the robot only listens during a short "voice window" and ignores everything else.


I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
#2  

I am using the Bing Speech Recognition skill

The microphone is on the Laptop I use

The microphone sensitivity is usually set around 70

I do use a wake word before I speak a command

Simple commands like "Robot move forward", "Raise right arm", "Open gripper" can get activated while the Speech Rec is still open.

I do already use a word to stop the Speech Rec or I sometimes will click on the Pause icon

#3  

That helps a lot - since you’re using Bing Speech Recognition with a laptop microphone, the most likely cause is false recognition from ambient noise or robot sound being interpreted as a command.

What’s probably happening

Bing Speech Recognition is cloud-based and generally accurate, but it still needs a clean audio signal. A laptop mic at 70% sensitivity can easily pick up:

  • room noise
  • servo/motor noise
  • the robot speaker
  • your voice plus background sounds

If the skill is still listening, even briefly, it can recognize something close enough to:

  • robot move forward
  • raise right arm
  • open gripper

and then ARC runs the Auto Position action attached to that phrase.

What I’d change first

1) Lower the microphone sensitivity

A setting of 70 is fairly high for a robot environment.

Try:

  • 40-55 as a starting point
  • then test again

If you’re too far away from the mic, move closer instead of increasing gain.

2) Use push-to-talk or very short listening windows

Since you already use a wake word, the next improvement is to make listening very brief:

  • wake word
  • say command
  • immediately stop listening

That reduces the time ARC is exposed to background noise.

If you’re currently leaving it open after the wake word, that’s the window where false triggers happen.

3) Increase the wake word confidence

In the Bing Speech Recognition skill settings, raise the Min Wake Word Confidence value a bit.

Example:

  • if it’s at 0.75, try 0.80 or 0.85

That makes accidental wake triggers less likely.

4) Use more unique phrases

Your commands are understandable, but some are still a bit generic.

For example:

  • robot move forward
  • raise right arm
  • open gripper

These are okay, but if the mic is hearing partial words or noisy speech, you can make them even more distinct:

  • robot advance forward
  • robot raise right arm
  • robot open gripper now

Distinct multi-word commands tend to work better than short, common words.

5) If possible, move the mic away from the robot

A laptop mic is often better than a mic mounted on the robot, but if the laptop is near the robot’s speaker or motors, it can still hear them.

Try:

  • the laptop a little farther from the robot
  • lower speaker volume
  • keep the robot motors still while speaking if possible

About the Pause icon and stop word

That’s good practice, but the issue is usually during the time it is active.

If the skill remains unpaused long enough for background noise to be interpreted, the commands can still fire.

Best-practice setup

For the most reliable behavior:

  • Wake word
  • single command
  • stop listening immediately
  • keep mic sensitivity moderate
  • use a headset or external USB mic if you can

A headset mic is often much better than a laptop mic for robots because it hears you, not the room.

ARC tip

For Bing Speech Recognition, you can review the skill’s options from the kebab menu / three-line menu and tune:

  • wake word confidence
  • phrase list
  • listening behavior
  • stop punctuation
  • microphone selection in Windows

My recommendation for your setup

Based on what you described, I’d try this order:

  1. Drop mic sensitivity from 70 to around 45-50
  2. Increase wake word confidence slightly
  3. Use a shorter listen window after the wake word
  4. Keep commands longer/more unique
  5. If possible, test with a headset mic

If you want, I can help you set up a safer voice-control flow in ARC so the robot only accepts commands for a few seconds after the wake word, then automatically disables listening.


#4  

@Robohappy, I feel your pain. I've feared exactly this happening. I have robot arms that are inside the robot and need to extend out before they start moving. You can only imagine what would happen if they started moving around while inside the robot. It's happened and was not a pretty sight. I have since implemented a few failsafe measures to keep this from happening and, if it does, to keep any damage from happening. Here's what I did:

Thankfully the motor that moves the arms up and down is a DC motor and is powered by an "intelligent" motor controller. The motor controller is a Sabertooth with a Kangaroo daughter board. This controller can sense a runaway or stall motor event. This has saved my butt a number of times.

I moved from using the microphone on the computer to a nice one that is plugged into it. Neither worked well. I was getting all sorts of false positive script triggers or just plain no response when I spoke. I ended up switching to a high quality headphone mic with noise canceling. I tried several and ended up loving the Plantronics - Voyager 5200 UC. It can be turned off and on easily and quickly if you want a complete disconnect from the computer, and it will quickly reconnect. It comes with a Bluetooth USB dongle and is made to be used with a computer via USB. Now I have at least 98% positive recognition with no false triggers. It's a bit pricey but worth every penny in my opinion. The downside is that I'm the only person that can command the robot with SR.

Like you, I use a wake word. It's a simple word that I don't use much in conversation. I use "Robot". I also have a visual set of lights that are programmed to come on for the duration of the "listening" period I have set. That way I have a visual of when to start speaking and for how long. I also have the wake word confidence set pretty high in the Bing SR skill, which helps to stop the false triggers.

Another failsafe I use is to set a variable for moving parts that may get damaged if they move at the wrong time. This also helps with operator error. For example, I'll have a global var called "$botharmsextended" that's set to 0 in ARC's init script at startup (because the arms are always inside the robot at startup). After the arms are commanded to extend, that script will check that the arms are indeed out of the robot (by checking both position and a triggered microswitch). If the script can confirm that the arms are indeed extended, it will change the var $botharmsextended to 1, meaning that it's safe to move the arms. If not, the var stays at 0. Now the trick is to have other scripts related to the arms check this global var to make sure they are safe to move. I use this method on other moving parts that may be damaged if they move before I'm ready. This method stops any damage but also adds a layer of complexity to your scripts.
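
The interlock described above could be sketched roughly like this in plain JavaScript. The `vars` object and the two boolean sensor readings are stand-ins I made up for illustration; they are not ARC calls, and the real checks would read the actual position and microswitch.

```javascript
// Hypothetical interlock modelled on the description above.
// Init script at startup: the arms always start inside the robot.
const vars = { $botharmsextended: 0 };

// Run after the extend command: mark the arms safe only if BOTH the
// reported position and the microswitch agree that they are out.
function confirmArmsExtended(positionReached, microswitchTriggered) {
  vars.$botharmsextended = positionReached && microswitchTriggered ? 1 : 0;
  return vars.$botharmsextended;
}

// Every arm-related script calls this before moving anything.
function armsMayMove() {
  return vars.$botharmsextended === 1;
}

console.log(armsMayMove());           // false: nothing confirmed yet
confirmArmsExtended(true, false);     // position reached, switch not triggered
console.log(armsMayMove());           // false: the two checks disagree
confirmArmsExtended(true, true);
console.log(armsMayMove());           // true: safe to move the arms
```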

I hope this helps somehow. Good luck with your issue.

#5  

That’s a solid set of safeguards, and it lines up really well with what I’d recommend in ARC.

What Dave described is basically the right mindset for a robot that can physically damage itself: don’t rely on speech recognition alone as the safety gate. Use speech as the trigger, but let ARC verify whether motion is allowed.

The best takeaway

For risky motions like:

  • extending arms
  • opening grippers
  • moving servos inside a body shell

you should add a safety condition before the action runs.

Good pattern in ARC

Instead of letting a speech phrase directly call an Auto Position action, do this:

  1. Speech Recognition hears the phrase
  2. The phrase runs a script instead of calling Auto Position directly
  3. The script checks a safety variable like $botharmsextended
  4. Only if the check passes does ARC run the Auto Position action

That way, even if Bing Speech mishears something, the robot won’t move unless the state is safe.

Example approach

If you use a global variable like:

  • $botharmsextended = 0 means unsafe
  • $botharmsextended = 1 means safe

then your voice command script can check it first.

JavaScript example

// Run the gripper frame only when the safety variable says the arms are out.
if (getVar("$botharmsextended", "0") == "1") {
  controlCommand("Auto Position", "AutoPositionFrame", "OPEN GRIPPER");
} else {
  print("Blocked: arms are not extended yet.");
}

If you want, you can also make the robot speak a warning:

// Same check, but speak the warning instead of printing it.
if (getVar("$botharmsextended", "0") == "1") {
  controlCommand("Auto Position", "AutoPositionFrame", "OPEN GRIPPER");
} else {
  Audio.say("Blocked. Arms are not extended.");
}

Even better: use a state-check script

For each risky command, wrap it in a simple "permission check" script. That gives you:

  • safer operation
  • fewer accidental movements
  • easier debugging
  • a clear place to add more conditions later

For example, you could also check:

  • $IsSpeaking
  • whether the robot is already moving
  • whether a gripper limit switch is active
  • whether the arm has fully extended
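
Several conditions like these can be combined into one permission check. The following is a plain-JavaScript sketch under stated assumptions: `guardedRun`, the `state` object, and the check names are hypothetical stand-ins for ARC variables and sensor readings, and `runAction` stands in for the Auto Position call.

```javascript
// Hypothetical permission check: every condition must pass before a risky
// action runs. Returns the names of unmet conditions (empty means it ran).
function guardedRun(actionName, state, checks, runAction) {
  const unmet = checks.filter((c) => !c.test(state)).map((c) => c.name);
  if (unmet.length === 0) {
    runAction(actionName);
  }
  return unmet;
}

const checks = [
  { name: "arms extended", test: (s) => s.armsExtended === 1 },
  { name: "robot not speaking", test: (s) => !s.isSpeaking },
  { name: "robot not moving", test: (s) => !s.isMoving },
];

const ran = [];
const unmet = guardedRun(
  "OPEN GRIPPER",
  { armsExtended: 1, isSpeaking: true, isMoving: false },
  checks,
  (name) => ran.push(name)
);
console.log(unmet.join(", ")); // robot not speaking
console.log(ran.length);       // 0: the action was blocked
```

Keeping the conditions in one list means a new check only has to be added in one place.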

About the microphone side

Dave’s mic advice is excellent too. A noise-canceling headset mic or a bluetooth headset like the Voyager 5200 UC can make a huge difference compared with a laptop mic.

For your current setup, I’d suggest:

  • lower mic sensitivity a bit
  • raise wake word confidence
  • use more distinct phrases
  • add a safety variable before every dangerous motion

ARC-specific suggestion

You can put these checks right in the All Recognized Script of Bing Speech Recognition, or in the script that gets called by the phrase.

If you haven’t already, open the Bing Speech skill’s kebab menu / three-line menu, go to Config, and use:

  • Phrase List
  • All Recognized Script
  • Start Listening Script
  • Not Handled Script

That gives you a clean place to filter commands before anything moves.

My recommendation

For your robot, I would treat speech recognition as an input, not a permission system.

So:

  • Speech says: "Open gripper"
  • Script checks: "Is it safe?"
  • Only then Auto Position runs

That’s the safest way to prevent surprise motions.

If you want, I can help you write a single ARC JavaScript safety wrapper for your Auto Position commands so all your voice commands go through the same protection logic.


#6  

@Athena, great suggestions. I will look at my settings to start with, then see where it all stands and get back to you.

@Dave, thanks for the external mic suggestion. I have been thinking about something like that for some time, so I think I will look further into it now :)

#7  

That sounds like a good plan.

Start with the Bing Speech Recognition settings first, then test with a few commands while keeping the robot in a safe pose. If you find the false triggers still happen, a noise-canceling headset or external mic will likely help a lot more than the laptop mic.

A few things I’d check in ARC:

  • Wake word confidence: raise it a bit
  • Microphone sensitivity in Windows: lower it from 70 if possible
  • Phrase list: make commands more distinct
  • Listening window: stop listening as soon as the command is spoken
  • Safety variable: use a global variable like $botharmsextended before allowing risky Auto Position actions

If you want to make the safety part more robust, I can help you build a small JavaScript wrapper in ARC so all voice commands check a safety condition before running Auto Position.

If you’d like, send me a screenshot of your Bing Speech Recognition config and I can point out exactly what to adjust.


#8  

One thing I was mistaken about: I generally use the regular Speech Recognition skill more than the Bing one for typical robot commands. This brings up a new question: should I stick with the regular Speech Recognition skill, or is the Bing Speech Recognition skill preferred as of late?