PRO
afcorson
Australia
Asked
Silence Detection For Bing Speech Recognition Skill
I thought I had requested this feature before, but I can't find any record of it. I would like the Bing Speech Recognition Skill to detect when a person stops talking and then stop recording. VAD is not that reliable, and limiting the recording length is a clumsy way to stop the recording. For example, if the recording limit is set to 7 seconds and the user just says 'hello robot', they would wait several seconds before getting a response, by which time they have probably walked away.
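To make the request concrete, here is a minimal Python sketch of the behavior described above: stop as soon as the speaker has been quiet for a short while, and fall back to the hard recording limit only if that never happens. It is not ARC's API or the way either skill is implemented. The function name `find_stop_frame`, the parameters `threshold`, `silence_frames`, and `max_frames`, and the idea of feeding it one normalized loudness value per 100 ms frame are all assumptions made for illustration.

```python
from typing import Iterable


def find_stop_frame(
    levels: Iterable[float],
    threshold: float = 0.1,    # assumed loudness above which a frame counts as speech
    silence_frames: int = 5,   # assumed run of quiet frames that ends the recording
    max_frames: int = 70,      # assumed hard limit (70 frames of 100 ms = 7 seconds)
) -> int:
    """Return the index of the frame at which recording should stop.

    Recording stops once speech has been heard and is then followed by
    `silence_frames` consecutive quiet frames, or when `max_frames` is
    reached, whichever comes first. Returns -1 if `levels` is empty.
    """
    speech_started = False
    quiet_run = 0
    index = -1
    for index, level in enumerate(levels):
        if level >= threshold:
            # Loud frame: the user is talking, so reset the silence counter.
            speech_started = True
            quiet_run = 0
        elif speech_started:
            # Quiet frame after speech: count toward the end-of-speech timeout.
            quiet_run += 1
            if quiet_run >= silence_frames:
                return index
        if index + 1 >= max_frames:
            # Hard limit reached without detecting the end of speech.
            return index
    return index


# 'hello robot': 0.2 s of quiet, 0.8 s of speech, then silence.
hello_robot = [0.0] * 2 + [0.5] * 8 + [0.0] * 60
print(find_stop_frame(hello_robot))  # 14
```

With 100 ms frames, stopping at frame 14 means the recording ends about 1.5 seconds in, instead of running to the 7 second limit. A fixed loudness threshold is the simplest possible detector and will be fooled by background noise, so treat this as a picture of the desired timing rather than a robust detector.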

The updated VAD skill now works a treat. My robots now know when I have stopped talking. ChatGPT can respond much more quickly without having to wait for the number of seconds specified in the settings.
That's really great to hear! I knew a lot of work went into it, including reviewing some white papers on voice activity detection. I haven't tried it myself, so your feedback is valuable!