Gilvan Gomes

Brazil

Asked Jun 2017 — Edited Jun 2017

Microsoft Cognitive Speech - Text To Speech

Hello y'all, I've created a C# project and I can already do basically the same as the Bing Speech Recognition plugin does, but I also need to make my EZ-B (JD) "speak" in another language. I've installed a Microsoft voice, but it's a pretty bad one, and the Azure platform offers two very nice voices I could use instead.

Right now I'm stuck on how to send the audio received from Azure to the EZ-B. Azure offers a variety of audio formats.

Has anyone tried this before? I've gone through some of the tutorials in the SDK but couldn't find one that does something like that.

Thanks! Gilvan

DJ Sures

PRO

Synthiam

#9 Jun 2017

I updated the plugin tutorial to include instructions on how to output audio: https://synthiam.com/Tutorials/UserTutorials/146/24

I also created a plugin with complete example and source code: https://synthiam.com/redirect/legacy?table=plugin&id=202

ptp

PRO

USA

#10 Jun 2017

DJ,

Nice!

Do you have plans to add the Microsoft Bing TTS feature to the existing plugin (i.e. the EZ-Robot Bing plugin)?

*** EDITED ***

DJ Sures

PRO

Synthiam

#11 Jun 2017

Microsoft TTS? As in the speech synthesis? It has an output stream... Just do this..


      using (MemoryStream s = EZBManager.EZBs[0].SpeechSynth.SayToStream("I am speaking out of the EZ-B"))
        EZBManager.EZBs[0].SoundV4.PlayData(s);

ptp

PRO

USA

#12 Jun 2017

@DJ,

Not that.

This: https://docs.microsoft.com/en-us/azure/cognitive-services/speech/api-reference-rest/bingvoiceoutput

ptp

PRO

USA

#13 Jun 2017

@Mickey666Maus,

you are mixing up concepts.

Quote:
http://www.downtown-tattoo.de/robotics/test123.html

I checked the JavaScript code: the speech recognition and TTS both run in the browser, and after the speech recognition is done the browser calls api.ai.

All the work is done in the browser; the server only serves the page.

Quote:
To connect to the ARCs server you would just have to make an XMLHttpRequest();

XMLHttpRequest is a JavaScript class used to GET or POST data to a web server.

You have a similar function in EZ-Script:


HttpGet( url )

Quote:
But the limitation is clearly that ARC cannot send data to the browser I guess?

There is no limitation: a browser is a client application. You want another client application, i.e. ARC, to call a client?

Of course there are some exceptions. I remember some years ago I used Windows DDE (old stuff, https://en.wikipedia.org/wiki/Dynamic_Data_Exchange) to interact with Internet Explorer, Microsoft Excel, etc.

To allow other clients to connect to an application, the application needs to expose a protocol, interfaces, methods, etc.

DJ Sures

PRO

Synthiam

#14 Jun 2017

@ptp, the example I provided will give you the ability to pipe any audio through the EZ-B. Simply pipe PCM data at the sample rate specified in the example into any of the PlayData() overloads. It doesn't matter where you get the data from. If you get it from the link provided, cool. If you get it from a microphone, cool. If you get it from summoning a spirit ghost in the form of compatible audio data, cool.

Here is a step-by-step recap in case you haven't viewed the links I provided. Have fun!

Source Code: OutputAudioFromEZ-BSource.zip

*Dependency: In addition to adding ARC.exe and EZ-B.DLL, this plugin requires the NAudio.DLL library to be added as a project reference. Remember to UNSELECT "Copy Files"!

This plugin provides the following examples:

Load audio from MP3 or WAV file


// MP3
NAudio.Wave.Mp3FileReader mp3 = new NAudio.Wave.Mp3FileReader(openFileDialog1.FileName);

// WAV
NAudio.Wave.WaveStream wav = new NAudio.Wave.WaveFileReader(openFileDialog1.FileName);

Convert the audio file to uncompressed PCM data at the sample rate and sample size supported by the EZ-B


NAudio.Wave.WaveFormatConversionStream pcm = new NAudio.Wave.WaveFormatConversionStream(new NAudio.Wave.WaveFormat(EZ_B.EZBv4Sound.AUDIO_SAMPLE_BITRATE, 8, 1), mp3);
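
For readers curious what the sample-size part of that conversion involves, here is a minimal sketch without NAudio. It assumes the input is 16-bit signed little-endian mono PCM and that the target is 8-bit unsigned PCM, which is the usual WAV convention for 8-bit audio; whether the EZ-B expects exactly this layout is an assumption based on the WaveFormat(..., 8, 1) call above. The names PcmConvert and To8BitUnsigned are hypothetical, and the sketch does not change the sample rate, so resampling would still be a separate step.

```csharp
using System;

public static class PcmConvert
{
    // Converts 16-bit signed little-endian mono PCM to 8-bit unsigned mono PCM.
    // The sample rate is left unchanged; resampling is not handled here.
    public static byte[] To8BitUnsigned(byte[] pcm16)
    {
        if (pcm16 == null)
            throw new ArgumentNullException(nameof(pcm16));
        if (pcm16.Length % 2 != 0)
            throw new ArgumentException("16-bit PCM must have an even byte count.", nameof(pcm16));

        byte[] pcm8 = new byte[pcm16.Length / 2];

        for (int i = 0; i < pcm8.Length; i++)
        {
            // Reassemble the little-endian sample (low byte first).
            short sample = (short)(pcm16[2 * i] | (pcm16[2 * i + 1] << 8));

            // Keep the top 8 bits and shift the range from -128..127 to 0..255.
            pcm8[i] = (byte)((sample >> 8) + 128);
        }

        return pcm8;
    }

    public static void Main()
    {
        // Three samples: silence (0), maximum (32767) and minimum (-32768).
        byte[] input = { 0x00, 0x00, 0xFF, 0x7F, 0x00, 0x80 };
        byte[] output = To8BitUnsigned(input);

        Console.WriteLine(string.Join(",", output));
    }
}
```

Silence maps to the midpoint value 128, which is why 8-bit audio that is accidentally treated as signed tends to sound heavily distorted.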

Compress PCM data with gzip to be stored in project STORAGE


                using (MemoryStream ms = new MemoryStream()) {

                  using (GZipStream gz = new GZipStream(ms, CompressionMode.Compress))
                    pcm.CopyTo(gz);

                  _cf.STORAGE[ConfigTitles.COMPRESSED_AUDIO_DATA] = ms.ToArray();
                }

Play audio data from compressed project STORAGE


        using (MemoryStream ms = new MemoryStream(compressedAudioData))
        using (GZipStream gz = new GZipStream(ms, CompressionMode.Decompress))
          EZBManager.EZBs[0].SoundV4.PlayData(gz);
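
The two storage snippets above can be checked outside ARC with a small round trip. This is a minimal sketch using only System.IO.Compression; the class and method names (AudioStorage, Compress, Decompress) are hypothetical, and the byte array stands in for real PCM data.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class AudioStorage
{
    // Gzip-compresses a PCM buffer into a byte array suitable for storage.
    public static byte[] Compress(byte[] pcm)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            // The GZipStream must be disposed before reading the buffer,
            // otherwise the final gzip block and footer are not written.
            using (GZipStream gz = new GZipStream(ms, CompressionMode.Compress))
                gz.Write(pcm, 0, pcm.Length);

            // ToArray() still works after the GZipStream has closed ms.
            return ms.ToArray();
        }
    }

    // Expands a gzip-compressed buffer back into the original PCM bytes.
    public static byte[] Decompress(byte[] compressed)
    {
        using (MemoryStream input = new MemoryStream(compressed))
        using (GZipStream gz = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            gz.CopyTo(output);
            return output.ToArray();
        }
    }

    public static void Main()
    {
        // Stand-in for PCM audio: a short repeating pattern around the midpoint.
        byte[] pcm = new byte[1000];
        for (int i = 0; i < pcm.Length; i++)
            pcm[i] = (byte)(128 + (i % 16));

        byte[] unpacked = Decompress(Compress(pcm));

        bool same = unpacked.Length == pcm.Length;
        for (int i = 0; same && i < pcm.Length; i++)
            same = unpacked[i] == pcm[i];

        Console.WriteLine(same ? "Round trip OK" : "Round trip FAILED");
    }
}
```

In the plugin itself the decompressing GZipStream is handed straight to PlayData(), so the audio never needs to be fully expanded into memory first.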

Supports ControlCommand() for Play and Stop of audio to be used in external scripts


    public override object[] GetSupportedControlCommands() {

      List<string> items = new List<string>();

      items.Add(ControlCommands.StartPlayingAudio);
      items.Add(ControlCommands.StopPlayingAudio);

      return items.ToArray();
    }

    public override void SendCommand(string windowCommand, params string[] values) {

      if (windowCommand.Equals(ControlCommands.StartPlayingAudio, StringComparison.InvariantCultureIgnoreCase))
        playStoredAudio();
      else if (windowCommand.Equals(ControlCommands.StopPlayingAudio, StringComparison.InvariantCultureIgnoreCase))
        stopPlaying();
      else
        base.SendCommand(windowCommand, values);
    }

Changes the status of the button when audio is playing globally from anywhere in ARC on EZ-B #0


    public FormMain() {

      InitializeComponent();

      EZBManager.EZBs[0].SoundV4.OnStartPlaying += SoundV4_OnStartPlaying;
      EZBManager.EZBs[0].SoundV4.OnStopPlaying += SoundV4_OnStopPlaying;
    }

    private void FormMain_FormClosing(object sender, FormClosingEventArgs e) {

      EZBManager.EZBs[0].SoundV4.OnStartPlaying -= SoundV4_OnStartPlaying;
      EZBManager.EZBs[0].SoundV4.OnStopPlaying -= SoundV4_OnStopPlaying;
    }

    private void SoundV4_OnStopPlaying() {

      Invokers.SetText(btnPlayAudio, "Play");
    }

    private void SoundV4_OnStartPlaying() {

      Invokers.SetText(btnPlayAudio, "Stop");
    }

Output Text to Speech

You can output text to speech easily as well, using the following code example...


      using (MemoryStream s = EZBManager.EZBs[0].SpeechSynth.SayToStream("I am speaking out of the EZ-B"))
        EZBManager.EZBs[0].SoundV4.PlayData(s);

ptp

PRO

USA

#15 Jun 2017

@DJ,

The code is clear, and I believe it helps/answers the initial request (Gilvan Gomes).

I believe Gilvan is trying to code a plugin to use Bing/Azure/Microsoft Cognitive TTS services with the EZ-B.

To avoid overlapping features between a user plugin and an EZ-Robot plugin, I asked if you have plans to create a Bing TTS plugin or extend the existing Bing Recognition plugin.

DJ Sures

PRO

Synthiam

#16 Jun 2017

No immediate plans - looks pretty simple though, especially since the service returns a byte array of compatible audio. Simply pump it through the examples and voila, you've got Bing TTS.
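
Before pumping a returned byte array into PlayData(), it can help to check what the bytes actually contain. Here is a minimal sketch, assuming the service was asked for a RIFF/WAV output format (raw formats have no header to inspect) and that the code runs on a little-endian machine. WavInfo and WavInspector are hypothetical names, not part of the EZ-B or Azure libraries.

```csharp
using System;
using System.Text;

public sealed class WavInfo
{
    public int Channels;
    public int SampleRate;
    public int BitsPerSample;
    public int DataOffset;   // index of the first audio byte
    public int DataLength;   // number of audio bytes
}

public static class WavInspector
{
    // Walks the RIFF chunks and reports the format fields and the payload location.
    public static WavInfo Parse(byte[] wav)
    {
        if (wav == null || wav.Length < 12 ||
            Encoding.ASCII.GetString(wav, 0, 4) != "RIFF" ||
            Encoding.ASCII.GetString(wav, 8, 4) != "WAVE")
            throw new ArgumentException("Not a RIFF/WAVE buffer.");

        WavInfo info = new WavInfo();
        bool haveFormat = false;
        int pos = 12;

        while (pos + 8 <= wav.Length)
        {
            string id = Encoding.ASCII.GetString(wav, pos, 4);
            int size = BitConverter.ToInt32(wav, pos + 4);
            int body = pos + 8;

            if (id == "data")
            {
                if (!haveFormat)
                    throw new ArgumentException("data chunk found before fmt chunk.");

                // Streaming sources may write a placeholder size, so clamp to the buffer.
                int available = wav.Length - body;
                info.DataOffset = body;
                info.DataLength = (size < 0 || size > available) ? available : size;
                return info;
            }

            if (size < 0 || size > wav.Length - body)
                break; // corrupt or truncated chunk

            if (id == "fmt " && size >= 16)
            {
                info.Channels = BitConverter.ToInt16(wav, body + 2);
                info.SampleRate = BitConverter.ToInt32(wav, body + 4);
                info.BitsPerSample = BitConverter.ToInt16(wav, body + 14);
                haveFormat = true;
            }

            // Chunks are padded to an even number of bytes.
            pos = body + size + (size & 1);
        }

        throw new ArgumentException("No data chunk found.");
    }

    public static void Main()
    {
        // Build a tiny 8 kHz, 8-bit, mono WAV in memory: 44-byte header plus 4 samples.
        byte[] wav = new byte[48];
        Encoding.ASCII.GetBytes("RIFF").CopyTo(wav, 0);
        BitConverter.GetBytes(40).CopyTo(wav, 4);
        Encoding.ASCII.GetBytes("WAVE").CopyTo(wav, 8);
        Encoding.ASCII.GetBytes("fmt ").CopyTo(wav, 12);
        BitConverter.GetBytes(16).CopyTo(wav, 16);
        BitConverter.GetBytes((short)1).CopyTo(wav, 20);  // PCM
        BitConverter.GetBytes((short)1).CopyTo(wav, 22);  // channels
        BitConverter.GetBytes(8000).CopyTo(wav, 24);      // sample rate
        BitConverter.GetBytes(8000).CopyTo(wav, 28);      // byte rate
        BitConverter.GetBytes((short)1).CopyTo(wav, 32);  // block align
        BitConverter.GetBytes((short)8).CopyTo(wav, 34);  // bits per sample
        Encoding.ASCII.GetBytes("data").CopyTo(wav, 36);
        BitConverter.GetBytes(4).CopyTo(wav, 40);

        WavInfo info = Parse(wav);
        Console.WriteLine(info.SampleRate + " Hz, " + info.BitsPerSample + "-bit, " +
                          info.Channels + " channel(s), " + info.DataLength + " bytes at offset " + info.DataOffset);
    }
}
```

If the reported format already matches what the EZ-B expects, the payload could be wrapped with new MemoryStream(wav, info.DataOffset, info.DataLength) and passed to PlayData(); otherwise it would first need converting, for example with the NAudio conversion stream shown earlier in the thread.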

I don't think it belongs in the Bing Recognition plugin. It would be a plugin of its own. The Bing recognition is for speech recognition, not speech synthesis. The configuration of the two is quite different from the user's perspective.
