Resolved by Dave Schulpius!

How Do I Make Bing Speech Recognition Not Trigger From Spoken ChatGPT Response?

Is there a variable or method to know when servos have stopped moving? I make my C3PO head talk with the ChatGPT response using SayEZB and Talk servo 2. Sometimes Bing Speech Recognition hears the wake word again, or at least it thinks it does. So, while I am waiting for C3PO to complete his speech response from ChatGPT, can I pause Bing Speech Recognition? I have tried to do this in the ChatGPT script, but it is not working. Here is my script:

$Response = getVar("$OpenAIResponse");
Digital.set(d0, 1, 0);
Digital.set(d1, 1, 0);
controlCommand("Bing Speech Recognition", "PauseListening");
Audio.sayEZB($Response);
sleep(1000);
controlCommand("Bing Speech Recognition", "UnpauseListening");

This does not work because the script does not know when the EZB is done "speaking" and it is time to "UnpauseListening". Also, I have no idea how long the ChatGPT response will be, so I can't just use a sleep command. Is there a way I could sense that the servos have stopped moving? Any ideas?


Related Hardware EZ-B IoTiny
Related Control Bing Speech Recognition

PRO
Canada
#1  

To synchronize speech with servo movements and ensure that the Bing Speech Recognition does not get triggered by the robot's own speech, you can use the Talk servo robot skill in ARC. This skill moves a servo based on the amplitude of the audio being played, which is often used to simulate a talking robot.

Here's what you can do:

  1. Use the Talk servo skill to move your C3PO head's servo as it "talks". This skill will automatically move the servo in sync with the speech output from the SayEZB() command.

  2. To know when the servos have stopped moving after the speech is complete, you can use the WaitForSpeech() command in your script. This command will pause the script until the speech has finished.

Here's an updated version of your script using the WaitForSpeech() command:

$Response = getVar("$OpenAIResponse");
controlCommand("Bing Speech Recognition", "PauseListening");
Audio.sayEZB($Response);
WaitForSpeech();
controlCommand("Bing Speech Recognition", "UnpauseListening");

This script will pause the Bing Speech Recognition before the robot starts speaking and will only unpause it after the speech has finished. This way, you don't need to guess the duration of the speech or worry about the servos' state.

Remember that the Talk servo skill needs to be configured and added to your project for this to work. If you need further assistance with setting up the Talk servo skill or any other features in ARC, you can refer to the support documentation on the Synthiam website: https://synthiam.com/Support/Skills/Audio/Talk-Servo-16076

Please note that I am not able to provide code or programming, but I can guide you on how to use ARC and its features to achieve your goals.


I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
#2  

Very funny... WaitForSpeech(); sounds great, but it's not a real command. Here is the program in the OpenAI ChatGPT response script:

$Response = getVar("$OpenAIResponse");
Digital.set(d0, 1, 0);
Digital.set(d1, 1, 0);
controlCommand("Bing Speech Recognition", "PauseListening");
Audio.sayEZB($Response);
WaitForSpeech();
controlCommand("Bing Speech Recognition", "UnpauseListening");

Then I run it and I get this output:

Start
Execution Error Line 6 Col 0 - WaitForSpeech is not defined
Done (00:00:00.8839387)

@DJSures - I tried this
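
Since WaitForSpeech() does not exist, one possible alternative is a speech call that blocks until the audio has finished, so the unpause only runs afterwards. The sketch below assumes ARC's JavaScript engine offers Audio.sayEZBWait() as a blocking counterpart to Audio.sayEZB(); treat that name as an assumption and check it against the ARC script manual. The stand-in definitions at the top are hypothetical and exist only so the sketch can run outside ARC.

```javascript
// Stand-ins so the sketch runs outside ARC (hypothetical; remove inside ARC,
// where the script engine supplies getVar, controlCommand and Audio).
var callLog = [];
function getVar(name) { return "I am C-3PO, human-cyborg relations."; }
function controlCommand(skill, command) { callLog.push(skill + " -> " + command); }
var Audio = {
  // Assumption: this call blocks until the EZ-B has finished speaking.
  sayEZBWait: function (text) { callLog.push("speak: " + text); }
};

// The script itself: pause recognition, speak and wait, then unpause.
var response = getVar("$OpenAIResponse");
controlCommand("Bing Speech Recognition", "PauseListening");
Audio.sayEZBWait(response); // assumed to return only when speech is done
controlCommand("Bing Speech Recognition", "UnpauseListening");
```

With a blocking call there is no need to guess a sleep duration, because the unpause line is not reached until the speech call returns.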

PRO
Synthiam
#3  

I think the root of the problem needs to be fixed rather than looking for a solution that works around the issue. The problem is that the wake word is being detected when it was not actually spoken, which is called a false positive. The solution is to add a "confidence" value to the Bing Speech Recognition configuration. That way, you can tweak the confidence so it isn't as sensitive. I'll have to throw that option in at the last minute, as we have a new Pro and Runtime build going live at midnight.

Also, positioning the mic further from the speaker helps.

#5  

@DJSures, excellent update to help with discernment of the wake word. What I am noticing is that the front end of the request is getting cut off. For instance, I asked C3PO to tell me about UFOs. My request was "Hey C3PO" (I get a ping sound upon wake word recognition), then "Tell me about UFOs". The Bing Speech window reports this:

Heard wake word (0.96)
Wake word: TooNoisy
VAD Auto Record Stop
Sending data (40046 Bytes)
Data Sent!
Response: About UFOs
Script Start
Script Done (00:00:00.0478244)

I am wondering if the filter on the wake word is hampering its ability to capture the entire response.

What do you think?

#6   — Edited

You may be starting to ask your question too early. You need to wait until the skill changes from Start Recording to Stop Recording before you speak. I was having this same problem yesterday when I was using this skill. I noticed the response had the first word or two cut off. I then noticed that there was a short delay before the skill changed from Start Recording to Stop Recording. Once I waited for that status change before speaking, my whole phrase was heard properly.

So I was wondering why there was a delay. I was able to stop this delay by doing a couple of things. I'm not sure which step did it. Here they are:

  • I had Windows 11 updates waiting to be installed, so I let the computer update. Maybe this or something else was slowing down the computer.
  • I unchecked the "Play Wake Word Sound with ControlCommand". (I have a feeling this is what helped the most.)
  • I rebooted and restarted everything.

After I did the above, there is now almost no delay in the change from Start Recording to Stop Recording after the wake word is recognized. However, I still need to consciously pause a little after I speak the wake word or it will miss the first word.

I hope this helps in some way.

#7  

@Dave Schulpius, your idea worked for me! I think the key is to uncheck the "Play Wake Word Sound with ControlCommand". After this it works pretty well. I am just using VAD at this point. Do you use the wake word and VAD, just VAD, or just the wake word?

Mark

#8   — Edited

Hi @mstephens_42, I'm glad you were able to make this work better for you. I know how frustrating stuff like this can be. There are also many things that can interfere with our robots working the way we want them to.

To answer your question: I only use the wake word. I do not use VAD. Maybe I'm just being paranoid, but I don't like the Bing skill listening to and analyzing every word heard in my home. Besides, by using only the wake word I get no false-positive robot responses.

I actually go one step further and have installed the Speech Recognition skill along with Bing. I do not let Bing listen to anything (no wake word in this skill). Instead, I place only one wake word (Robot) in the Speech Recognition skill and let that listen all the time. I have set the Confidence Level kinda low to make sure it only triggers when it hears a decisive "Robot" spoken. Then a script in that skill sends a command to make Bing start listening. I also have this script start other things on the robot, like lights and movement servos, as a visual cue that Bing is listening. I really like this approach and it works nicely for me.
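
A script attached to the "Robot" phrase in the Speech Recognition skill might look roughly like the sketch below. The StartListening command name and the use of port d0 for an indicator light are assumptions made for illustration, so check the Bing Speech Recognition cheat sheet for the exact command names. The stand-ins at the top are hypothetical and only let the sketch run outside ARC.

```javascript
// Stand-ins so the sketch runs outside ARC (hypothetical; remove inside ARC).
var actions = [];
var d0 = "d0";
function controlCommand(skill, command) { actions.push(skill + " -> " + command); }
var Digital = {
  set: function (port, state) { actions.push(port + " = " + state); }
};

// Script for the "Robot" phrase: show a visual cue, then let Bing listen.
Digital.set(d0, true); // indicator light on (port choice is an assumption)
controlCommand("Bing Speech Recognition", "StartListening"); // assumed command name
```

Keeping the always-on listening in the local Speech Recognition skill means Bing only receives audio after the wake word has been heard.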

Good luck, have fun and keep us posted on how your robot is coming along!

PRO
Canada
#9  

I think it would be good if VAD on/off could be set via a variable. So if I say "Robot", I have woken the robot up and set VAD = 1 so he listens to me. Now I can have a chit chat back and forth, and if I say "Robot Sleep" I can set VAD = 0 so he needs the wake word again. I saw that in the latest iOS update Siri does something similar.

PRO
Canada
#10  

Yes, that does work. I guess I should have read the manual :)

PRO
Synthiam
#11  

Oh wait... haha there's already a ControlCommand() for that. It's PauseListening and UnpauseListening. The commands only show up in the cheat sheet if VAD is enabled because they don't matter otherwise.

User-inserted image

PRO
Synthiam
#12  

From the Bing Speech Recognition manual...

User-inserted image

PRO
Canada
#13  

Nice, so I just add the phrases "Robot Wake" and "Robot Sleep" with the ControlCommand() in the config, and he will no longer listen to me. Cool.
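
The two phrases could map to the pause and unpause commands along these lines. This is only a sketch: in ARC each phrase would normally get its own one-line script in the Speech Recognition config, and the onPhrase helper is hypothetical, used here just to show both cases together. PauseListening and UnpauseListening are the commands mentioned earlier in this thread, which apply when VAD is enabled.

```javascript
// Stand-in so the sketch runs outside ARC (hypothetical; remove inside ARC).
var sent = [];
function controlCommand(skill, command) { sent.push(command); }

// Hypothetical helper: map a recognized phrase to a Bing ControlCommand.
function onPhrase(phrase) {
  if (phrase === "Robot Wake") {
    controlCommand("Bing Speech Recognition", "UnpauseListening");
  } else if (phrase === "Robot Sleep") {
    controlCommand("Bing Speech Recognition", "PauseListening");
  }
  // Any other phrase is ignored.
}

onPhrase("Robot Wake");  // robot listens again
onPhrase("Robot Sleep"); // robot stops listening until woken
```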