Hello y'all,
I've created a C# project and I'm already able to do basically the same as the Bing Speech Recognition plugin does, but I also need to make my EZ-B (JD) "speak" in another language. I've installed a Microsoft voice, but it's a pretty bad one, and the Azure platform offers two very nice voices that could be used.
Right now I'm stuck on how to send the voice audio received from Azure to the EZ-B. Azure offers a variety of audio formats.
Has anyone tried this before? I've gone through some of the tutorials in the SDK but couldn't find one that does something like that.
Thanks!
Gilvan
Check the UniversalBot code
http://synthiam.com/Products/ARC
Browsing the code, you'll find the information needed to send the sound data to the EZ-B.
I've done that before, but I can't find the code.
@DJ ?
The API you mentioned (the JavaScript speech-synthesis example code) is only supported within the browser (and not all browsers), although Chrome handles it pretty well.
Some complaints:
http://ejb.github.io/2015/06/07/html5-speech-synthesis-api.html
Even if you manage to launch the Chrome engine (V8) the way ARC does with the Blockly editor, you don't have a way to extract the synthesized voice audio:
https://stackoverflow.com/questions/21905583/record-html5-speechsynthesisutterance-generated-speech-to-file
Still a neat idea for the web...
This is a working example of a web-based client! It can send data over to ARC, but it cannot be called from within ARC!
http://www.downtown-tattoo.de/robotics/test123.html
But the limitation is clearly that ARC cannot send data to the browser, I guess?
At least I did not find a solution for this!
I also created a plugin with a complete example and source code: https://synthiam.com/redirect/legacy?table=plugin&id=202
Nice!
Do you have plans to add the Microsoft Bing TTS feature to the existing plugin, i.e. the EZ-Robot Bing plugin?
*** EDITED ***
Code:
Not that.
This:
https://docs.microsoft.com/en-us/azure/cognitive-services/speech/api-reference-rest/bingvoiceoutput
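For reference, here is a minimal sketch of calling that REST endpoint from C#. The endpoint URL, header names and output-format string are taken from the documentation linked above (please verify them there), and the SSML body with its voice name is a placeholder you would build from the service's voice list.
Code:
// Sketch only: verify the endpoint, headers and output formats against the linked docs.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

class BingTtsSketch
{
    public static async Task<byte[]> SynthesizeAsync(string subscriptionKey, string ssml)
    {
        using (var http = new HttpClient())
        {
            // 1) Trade the subscription key for a short-lived bearer token.
            http.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
            var tokenResponse = await http.PostAsync(
                "https://api.cognitive.microsoft.com/sts/v1.0/issueToken",
                new StringContent(string.Empty));
            string token = await tokenResponse.Content.ReadAsStringAsync();

            // 2) POST SSML to the synthesize endpoint and get the audio bytes back.
            //    (The docs also list X-Search-AppId / X-Search-ClientID / User-Agent headers.)
            var request = new HttpRequestMessage(HttpMethod.Post,
                "https://speech.platform.bing.com/synthesize");
            request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", token);
            // Uncompressed PCM is the easiest format to convert for the EZ-B.
            request.Headers.Add("X-Microsoft-OutputFormat", "riff-16khz-16bit-mono-pcm");
            request.Content = new StringContent(ssml, Encoding.UTF8, "application/ssml+xml");

            var response = await http.SendAsync(request);
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsByteArrayAsync(); // WAV (RIFF) bytes
        }
    }
}
The returned WAV data can then be downsampled to the EZ-B format, as in the plugin examples further down.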
You are mixing up concepts.
I checked the JavaScript code: the speech recognition and TTS run in the browser, and after the speech recognition is done, the browser calls api.ai.
All the work is done in the browser; the server only serves the page.
XMLHttpRequest is a JavaScript class used to GET or POST data to a web server.
You have a similar function in EZ-Script:
Code:
There is no limitation: a browser is a client application. You want another client application, i.e. ARC, to call a client?
Of course there are some exceptions. I remember some years ago I used Windows DDE (old stuff):
https://en.wikipedia.org/wiki/Dynamic_Data_Exchange
to interact with Internet Explorer, Microsoft Excel, etc.
To allow other clients to connect to an application, the application needs to expose a protocol, interfaces, methods, etc.
Here is a step-by-step recap in case you haven't viewed the links I provided. Have fun!
Source Code:
OutputAudioFromEZ-BSource.zip
*Dependency: In addition to adding ARC.exe and EZ-B.DLL, this plugin requires the NAudio.DLL library to be added as a project reference. Remember to UNSELECT "Copy Files"!
This plugin provides the following examples:
1) Load audio from MP3 or WAV file
Code:
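// Sketch only: the full example ships in the attached source zip; this just
// illustrates the idea using the NAudio dependency mentioned above.
using System;
using NAudio.Wave;

static class AudioLoader
{
    public static WaveStream LoadAudioFile(string filename)
    {
        // NAudio exposes both MP3 and WAV files as a WaveStream we can convert later.
        return filename.EndsWith(".mp3", StringComparison.OrdinalIgnoreCase)
            ? (WaveStream)new Mp3FileReader(filename)
            : new WaveFileReader(filename);
    }
}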
2) Convert the audio file to uncompressed PCM data at the EZ-B's supported sample rate and sample size
Code:
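// Sketch only; the attached source has the real version. The EZ-B v4 plays 8-bit
// mono PCM at a fixed sample rate; the exact rate/size constants live in the EZ-B
// SDK (pass them in here; the figures are assumptions to check against EZ-B.DLL).
using System.IO;
using NAudio.Wave;

static class AudioConverter
{
    public static byte[] ToEzbPcm(WaveStream source, int ezbSampleRate)
    {
        // Step 1: let NAudio/ACM resample to 16-bit mono at the EZ-B rate.
        // (Depending on the source format, this may need to be split into separate
        // rate- and channel-conversion steps.)
        var intermediate = new WaveFormat(ezbSampleRate, 16, 1);
        using (var converted = new WaveFormatConversionStream(intermediate, source))
        using (var pcm16 = new MemoryStream())
        {
            converted.CopyTo(pcm16);

            // Step 2: squash the 16-bit signed samples down to 8-bit unsigned.
            byte[] buf = pcm16.ToArray();
            byte[] pcm8 = new byte[buf.Length / 2];
            for (int i = 0; i < pcm8.Length; i++)
            {
                short sample = (short)(buf[i * 2] | (buf[i * 2 + 1] << 8));
                pcm8[i] = (byte)((sample >> 8) + 128);
            }
            return pcm8;
        }
    }
}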
3) Compress PCM data with gzip to be stored in project STORAGE
Code:
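// Sketch only. GZip keeps the audio small enough to embed in the ARC project file;
// where exactly the plugin's STORAGE dictionary lives is shown in the attached source.
using System.IO;
using System.IO.Compression;

static class AudioCompression
{
    public static byte[] Compress(byte[] pcmData)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
                gzip.Write(pcmData, 0, pcmData.Length);

            return output.ToArray(); // store these bytes in the project STORAGE
        }
    }
}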
4) Play audio data from compressed project STORAGE
Code:
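// Sketch only. Decompress the stored bytes and stream them to the EZ-B.
// EZBManager.EZBs[0].SoundV4.PlayData(...) is what I'd expect from the ARC plugin
// framework, but treat the exact member names as assumptions and check the
// attached source / EZ-B SDK.
using System.IO;
using System.IO.Compression;

static class AudioPlayback
{
    public static void PlayCompressed(byte[] compressed)
    {
        var pcm = new MemoryStream();

        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
            gzip.CopyTo(pcm);

        pcm.Position = 0;

        // Stream the uncompressed PCM to the first connected EZ-B.
        ARC.EZBManager.EZBs[0].SoundV4.PlayData(pcm);
    }
}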
5) Supports ControlCommand() for Play and Stop of audio to be used in external scripts
Code:
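// Sketch only, as a member of the plugin form class. ARC routes
// ControlCommand("Plugin Name", "Play") calls from scripts into the plugin; the
// SendCommand override below is the pattern I'd expect, but confirm the exact
// signature against the attached source / plugin tutorials.
public override void SendCommand(string windowCommand, params string[] values)
{
    if (windowCommand.Equals("Play", StringComparison.InvariantCultureIgnoreCase))
        AudioPlayback.PlayCompressed(_storedAudio);   // _storedAudio: hypothetical field holding the bytes from example 3
    else if (windowCommand.Equals("Stop", StringComparison.InvariantCultureIgnoreCase))
        ARC.EZBManager.EZBs[0].SoundV4.Stop();        // assumed Stop() on SoundV4
    else
        base.SendCommand(windowCommand, values);
}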
6) Changes the status of the button when audio is playing on EZ-B #0, triggered from anywhere in ARC
Code:
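// Sketch only, inside the plugin form class. The idea is to subscribe to the
// SoundV4 start/stop notifications of EZ-B #0 so the button reflects audio started
// from anywhere in ARC. The event names and delegate signatures below are
// assumptions; check EZ-B.DLL for the real ones.
void HookAudioEvents()
{
    ARC.EZBManager.EZBs[0].SoundV4.OnStartPlaying += () => SetPlayingState(true);
    ARC.EZBManager.EZBs[0].SoundV4.OnStopPlaying  += () => SetPlayingState(false);
}

void SetPlayingState(bool playing)
{
    // UI updates must be marshalled back onto the form's thread.
    // (btnPlay is a hypothetical button on the plugin form.)
    Invoke((Action)(() => btnPlay.Text = playing ? "Stop" : "Play"));
}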
Output Text to Speech
You can output text to speech easily as well, using the following code example...
Code:
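// Sketch only: one way to do it with the built-in Windows voices (requires a
// reference to System.Speech); the original attachment may differ. The 8-bit/mono
// format and 14700 Hz rate are what the EZ-B v4 is said to expect (assumed; see
// the EZ_B.SoundV4 constants in the SDK).
using System.IO;
using System.Speech.AudioFormat;
using System.Speech.Synthesis;

static class TextToSpeechOutput
{
    public static void SayOnEzb(string text)
    {
        var pcm = new MemoryStream();

        using (var synth = new SpeechSynthesizer())
        {
            // Render straight to headerless PCM in the EZ-B's format.
            synth.SetOutputToAudioStream(pcm,
                new SpeechAudioFormatInfo(14700, AudioBitsPerSample.Eight, AudioChannel.Mono));
            synth.Speak(text);
        }

        pcm.Position = 0;
        ARC.EZBManager.EZBs[0].SoundV4.PlayData(pcm);   // assumed member names, as above
    }
}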
The code is clear, and I believe it helps/answers the initial request (Gilvan Gomes).
I believe Gilvan is trying to code a plugin to use the Bing/Azure/Microsoft Cognitive TTS services with the EZ-B.
To avoid overlapping features (a user plugin vs. an EZ-Robot plugin), I asked if you have plans to create a Bing TTS plugin or to extend the existing Bing Recognition plugin.
I don't think it belongs in the Bing Recognition plugin. It would be a plugin of its own. The Bing Recognition plugin is for speech recognition, not speech synthesis. The configuration of the two is quite different from the user's perspective.
Correct, I only asked because it is in line with your Microsoft effort/partnership.
Correct, I assumed that mainly because of the name and the possibility of sharing the same keys.
If there is anything that can be done to make a variety of voices/languages available, it would be highly appreciated!
In my research project, JD will work with blind people; that's why speech is so important, and the Microsoft Cognitive Services APIs have everything needed to complement JD's features. Once I'm done with the project itself, I'm going to publish the plugin (if no one else does it before then).
Thanks again!
Gilvan
Also, if you happen to create a video or other media regarding the research project, I'll be sure to have it added to a newsletter. The EZ-Robot newsletter reaches pretty high-profile people; I'm certain you will get quality viewers.
Thank you!
Gilvan