How Do I Include Pauses In Speech Synthesis?

How can I incorporate natural pauses into a speech script using Synthiam ARC? For instance, I want it to say "HO pause HO pause HO" with noticeable pauses in between.

Currently, my Santa head is saying "HO HO HO" too quickly, and I would like to slow it down for a more natural delivery.


Related Hardware EZ-B IoTiny
Related Control Advanced Speech Synthesis

PRO
Canada
#1  

To incorporate natural pauses into your speech synthesis using the Advanced Speech Synthesis skill in Synthiam ARC, you can use the Speech Synthesis Markup Language (SSML). SSML lets you add pauses and adjust pitch, rate, and volume so the speech sounds more natural.

Here's how you can add pauses to your speech script:

  1. Open the Advanced Speech Synthesis Skill: In your ARC project, locate the Advanced Speech Synthesis skill and open its configuration.

  2. Use SSML for Pauses: You can use the <break> tag in SSML to add pauses. The time attribute specifies the duration of the pause. For example, to add a half-second pause, you can use <break time="500ms"/>.

  3. Example Script: Here's an example of how you can structure your script to include pauses:

    <speak>
      HO <break time="500ms"/> HO <break time="500ms"/> HO
    </speak>
    
  4. Run the Script: Pass the SSML string to the Audio.saySSML() function. Here's how you can do it in JavaScript:

    var ssml = '<speak>HO <break time="500ms"/> HO <break time="500ms"/> HO</speak>';
    Audio.saySSML(ssml);
    
  5. Test the Speech: Run your script to test the speech output. Adjust the time attribute in the <break> tags as needed to achieve the desired pause duration.
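
If you expect to tune the pause length a few times, the steps above can be sketched as a small helper that builds the SSML string for you. Note that buildPausedSsml is a hypothetical name made up for this example, not part of the ARC API, and the guard around Audio.saySSML() is only there so the snippet can also be tried outside ARC:

```javascript
// Hypothetical helper (not part of ARC): joins the given words with
// SSML <break> tags of the requested duration in milliseconds.
function buildPausedSsml(words, pauseMs) {
  var breakTag = ' <break time="' + pauseMs + 'ms"/> ';
  return "<speak>" + words.join(breakTag) + "</speak>";
}

// Same "HO pause HO pause HO" script as in step 4, with a 500 ms pause.
var ssml = buildPausedSsml(["HO", "HO", "HO"], 500);

// Audio.saySSML() is only available inside ARC, so skip the call elsewhere.
if (typeof Audio !== "undefined" && typeof Audio.saySSML === "function") {
  Audio.saySSML(ssml);
}
```

Changing the second argument (for example to 800) is then the only edit needed to lengthen every pause at once.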

By using SSML, you can fine-tune the speech output to sound more natural and fit your specific requirements. If you have any more questions or need further assistance, feel free to ask!


I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
PRO
Synthiam
#2   — Edited

Are you using the Advanced Speech Synthesis robot skill, which has nicer voices, or the built-in one that uses the Audio.say() command?

var str = "<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>" +
 "HO <break time='500ms'/>" +
 "HO <break time='500ms'/>" +
 "HO" +
 "</speak>";

Audio.saySSML(str);

And of course, the laziest way is to just add periods to force a delay between the words haha.

Audio.say("ho... ho... ho...");

You can also read about how to add inflection and change the pitch and stuff for SSML: https://synthiam.com/Support/javascript-api/Audio/saySSML

Edit: PS, if you want the audio to come out of the EZ-B, use the Audio.SaySSMLEZB() command instead.

PRO
Synthiam
#3   — Edited

This is a good one. It sings... I had ChatGPT generate this for me. The thing is, I had to tell it to exaggerate the pitch values to the extreme, because the original version barely changed the voice.

var str = "<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>" +
    "<prosody pitch='+80%' volume='loud'>Dashing</prosody> " +
    "<prosody pitch='+120%' volume='loud'>through the snow</prosody> " +
    "<break time='300ms'/>" +
    "<prosody pitch='+150%' volume='loud'>in a one-horse open sleigh</prosody>, " +
    "<break time='500ms'/>" +
    "<prosody pitch='+100%' volume='loud'>o'er the fields</prosody> " +
    "<prosody pitch='+140%' volume='loud'>we go</prosody>, " +
    "<break time='300ms'/>" +
    "<prosody pitch='+180%' volume='loud'>laughing</prosody> " +
    "<prosody pitch='+200%' volume='loud'>all the way!</prosody> " +
    "<break time='700ms'/>" +
    "<prosody pitch='+120%' volume='loud'>Bells on bobtail ring</prosody>, " +
    "<break time='300ms'/>" +
    "<prosody pitch='+160%' volume='loud'>making spirits bright</prosody>, " +
    "<break time='300ms'/>" +
    "<prosody pitch='+140%' volume='loud'>what fun it is</prosody> " +
    "<prosody pitch='+180%' volume='loud'>to laugh and sing</prosody>, " +
    "<break time='300ms'/>" +
    "<prosody pitch='+200%' volume='loud'>a sleighing song tonight!</prosody> " +
    "<break time='1s'/>" +
    "<prosody pitch='+150%' volume='loud'>Jingle bells,</prosody> " +
    "<break time='300ms'/>" +
    "<prosody pitch='+200%' volume='loud'>jingle bells,</prosody> " +
    "<break time='300ms'/>" +
    "<prosody pitch='+250%' volume='loud'>jingle all the way!</prosody> " +
    "<break time='500ms'/>" +
    "<prosody pitch='+120%' volume='loud'>Oh what fun</prosody> " +
    "<prosody pitch='+160%' volume='loud'>it is to ride</prosody> " +
    "<break time='300ms'/>" +
    "<prosody pitch='+200%' volume='loud'>in a one-horse</prosody> " +
    "<prosody pitch='+250%' volume='loud'>open sleigh!</prosody> " +
    "<break time='1s'/>" +
    "<prosody pitch='+180%' volume='loud'>Jingle bells, jingle bells,</prosody> " +
    "<break time='300ms'/>" +
    "<prosody pitch='+250%' volume='loud'>jingle all the way!</prosody> " +
    "<break time='500ms'/>" +
    "<prosody pitch='+160%' volume='loud'>Oh what fun it is</prosody> " +
    "<prosody pitch='+200%' volume='loud'>to ride</prosody> " +
    "<break time='300ms'/>" +
    "<prosody pitch='+300%' volume='loud'>in a one-horse open sleigh!</prosody>" +
    "</speak>";

Audio.saySSML(str);
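
If the long string concatenation gets hard to edit, one option is to keep the phrases in an array and generate the SSML from it. This is only a sketch: buildSongSsml and the shape of the entries are made up for this example and are not part of the ARC API, and only the first few phrases of the song are shown:

```javascript
// Hypothetical helper (not part of ARC): builds one SSML document from a list
// of entries. An entry is either { text, pitch } for a spoken phrase or
// { pause } for a break between phrases.
function buildSongSsml(parts) {
  var out = "<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>";
  for (var i = 0; i < parts.length; i++) {
    var p = parts[i];
    if (p.pause) {
      out += "<break time='" + p.pause + "'/>";
    } else {
      out += "<prosody pitch='" + p.pitch + "' volume='loud'>" + p.text + "</prosody> ";
    }
  }
  return out + "</speak>";
}

var song = buildSongSsml([
  { text: "Dashing", pitch: "+80%" },
  { text: "through the snow", pitch: "+120%" },
  { pause: "300ms" },
  { text: "in a one-horse open sleigh", pitch: "+150%" },
  { pause: "500ms" }
]);

// Audio.saySSML() is only available inside ARC, so skip the call elsewhere.
if (typeof Audio !== "undefined" && typeof Audio.saySSML === "function") {
  Audio.saySSML(song);
}
```

With this layout, changing a pitch value or a pause length means editing one array entry instead of hunting through quoted strings.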