The IBM Watson Services plugin created by @PTP currently lets you perform speech to text, text to speech, and visual recognition using Watson services.
You can download and install the plugin here: https://www.ez-robot.com/EZ-Builder/Plugins/view/251
Before you can use the plugin, you will need to sign up for a free 30-day trial of IBM Speech to Text here https://www.ibm.com/watson/services/speech-to-text and a free 30-day trial of IBM Text to Speech here https://www.ibm.com/watson/services/text-to-speech/
I will create some examples as this moves forward and try to answer any how-to questions.
Thanks for creating this, PTP; as with all your plugins, it is an excellent piece of work and a showcase of your talents.
Here is a quick Watson Assistant project I started writing. Sorry, I am learning both EZ-Script and Watson Assistant, so it is really rough, but it will give everyone an idea. I gave him a bad attitude because he never listens to me. I still need to go back and do a lot more work to record what position he is currently in, accept multiple instructions, and so on, but if others want to work on it together, that would be great. This is for the JD robot.
To set it up:
- Call it from the Speech to Text plugin's "before speech to text" script.
- Call it again from the "after speech to text" script.
- Put the script in the Watson Assist Script field in the plugin.
- Import it into Watson Assistant as a robotest.json file.
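For anyone who wants to poke at the Watson side outside EZ-Builder, here is a minimal Python sketch of the same round trip against the Watson Assistant V1 message endpoint. The service URL, workspace ID, and API key are placeholders for your own IBM Cloud credentials, and the version date is just an example.

```python
# Minimal sketch: send one recognized utterance to a Watson Assistant (V1)
# workspace and read back the reply. All credentials below are placeholders.
import requests

SERVICE_URL = "https://gateway.watsonplatform.net/assistant/api"  # your region URL
WORKSPACE_ID = "your-workspace-id"
API_KEY = "your-api-key"

def ask_watson(text):
    """Send one utterance to the workspace and return the reply text."""
    r = requests.post(
        f"{SERVICE_URL}/v1/workspaces/{WORKSPACE_ID}/message",
        params={"version": "2019-02-28"},
        auth=("apikey", API_KEY),
        json={"input": {"text": text}},
    )
    r.raise_for_status()
    # V1 returns the reply as a list of strings under output.text
    return " ".join(r.json()["output"]["text"])

print(ask_watson("JD, wave your arm"))
```

In the plugin the equivalent call happens inside the Watson Assist Script, so this only shows the shape of the request and response.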
Regarding the microphone subject (quality):
Has anyone tested the Kinect microphone array?
It can't be added to a small robot... and they will soon stop selling it...
What model Kinect do you have? I have an old Kinect 360 missing a power supply; I can hack something together and test. Hopefully the OpenKinect or PrimeSense NiTE drivers still work with Windows 10. I use a PS3 Eye on my desktop and it works really well with voice recognition (great mic), although it would be good to get a Kinect working, especially if we get a Unity plugin.
I have a Kinect 2, but I'm having trouble getting the microphone array to pick up sound in Windows. I'll look into it today and test the recognition if I can get it to work.
Kinect: I'm curious how good the Kinect is in a noisy/open environment.
@Nink: I have a PS3 Eye, an Asus Xtion, a Kinect 1, and a Kinect 2. All of them have microphone arrays, and the Kinect has a sound localization API. One can use it to turn the robot's head towards the sound/person.
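As a rough illustration of that idea (this is not the Kinect SDK itself, just the mapping step), here is a small Python sketch that turns a reported sound-source angle into a head-servo position. The angle range and servo limits are assumptions you would tune for your own robot.

```python
# Sketch: map a sound-direction estimate to a head-servo position.
# The mic array's reported angle range and the servo limits are placeholders.
def angle_to_servo(angle_deg, angle_range=(-50, 50), servo_range=(30, 150)):
    """Linearly map a direction-of-arrival angle to a servo position."""
    lo_a, hi_a = angle_range
    lo_s, hi_s = servo_range
    angle_deg = max(lo_a, min(hi_a, angle_deg))   # clamp to the array's field
    t = (angle_deg - lo_a) / (hi_a - lo_a)
    return int(lo_s + t * (hi_s - lo_s))

print(angle_to_servo(25))   # sound from 25 degrees right -> servo position 120
```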
Does the PS3 Eye work over long distances and/or with environmental noise?
I'm evaluating a few microphone arrays, and one of the cheapest solutions is to use the PS3 Eye with a Raspberry Pi Zero and forward the sound to the PC (a Wi-Fi microphone).
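As a sketch of what the Pi-side sender could look like (using PyAudio; the PC address, port, and sample rate are all placeholders):

```python
# Sketch of the "wifi microphone" idea: capture PCM audio from the PS3 Eye
# on the Pi Zero and stream it to the PC over TCP. PC_IP/PORT are placeholders.
import socket
import pyaudio

PC_IP, PORT = "192.168.1.50", 5000    # address of the listening PC
RATE, CHUNK = 16000, 1024             # 16 kHz mono is plenty for speech

sock = socket.create_connection((PC_IP, PORT))
pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                 input=True, frames_per_buffer=CHUNK)
try:
    while True:
        sock.sendall(stream.read(CHUNK, exception_on_overflow=False))
finally:
    stream.close()
    pa.terminate()
    sock.close()
```

On the PC side a matching script just accepts the connection and feeds the raw PCM into the recognizer or a virtual audio device.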
Post #19 https://synthiam.com/Community/Questions/10781&page=2
Not sure about background noise, but the PS3 Eye is good from a distance (nothing works well with background noise). I think we just need to do a "launch name" (like "Hey Siri", "OK Google", or "Echo") using EZ-B voice recognition and hope for the best.
This doesn't solve the depth sensing issue, though.
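The launch-name pattern itself is easy to prototype outside EZ-Builder. Here is a rough Python sketch using the speech_recognition package, with "robot" as an arbitrary wake word and offline PocketSphinx as the cheap always-on recognizer; EZ-B voice recognition would play that role in practice.

```python
# Sketch of a wake-word gate: listen continuously with a cheap offline
# recognizer, and only involve the cloud STT once the wake word is heard.
import speech_recognition as sr

WAKE_WORD = "robot"          # arbitrary launch name
r = sr.Recognizer()

with sr.Microphone() as mic:
    r.adjust_for_ambient_noise(mic)
    while True:
        audio = r.listen(mic, phrase_time_limit=3)
        try:
            heard = r.recognize_sphinx(audio)   # offline, keeps the cloud quiet
        except sr.UnknownValueError:
            continue
        if WAKE_WORD in heard.lower():
            print("Wake word heard; hand the next utterance to Watson STT")
```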
I am happy to go two for one on a depth sensor (I will buy two and send you one) if you want to work on something. I have been waiting for EZ-Robot to provide a LIDAR to do SLAM. This, in conjunction with the Unity work going on, would be exciting. Maybe we should just look at getting a couple of these for now: https://click.intel.com/realsense.html
Off topic or on topic (not sure any more): I have my latest photos of the Bicycle cards, ace through six. I can send you a link to the cards offline, since you own a deck, but the results still are not good. I think I need to string together multiple AI searches, but the time delay is an issue: Watson Visual Recognition does not seem to support a VR pipeline or linked requests, so I have to call multiple VR requests from a single EZB script based on the previous VR outcome, and that takes A LONG TIME. First find the suit (hearts, diamonds, clubs, spades); once the suit is derived, find picture or number (check whether it is a number or a picture card); then the actual card (number within the suit). And it still gets it wrong. Maybe I'll work on it on the weekend if I have time.
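For anyone following along, here is what that staged flow looks like in Python. The classify() helper is a stand-in for the Watson Visual Recognition call, and the classifier IDs are placeholders; each stage is a separate request, which is exactly why the chain is slow.

```python
# Sketch of the staged card-recognition flow: suit -> picture/number -> rank.
# classify() is a placeholder for a Watson Visual Recognition request with
# one custom classifier; the canned answers below just make the sketch run.
def classify(image_path, classifier_id):
    """Placeholder: return the top class from one custom classifier."""
    canned = {"suit_classifier": "hearts",
              "picture_vs_number_classifier": "number"}
    return canned.get(classifier_id, "4")

def identify_card(image_path):
    suit = classify(image_path, "suit_classifier")                 # stage 1
    kind = classify(image_path, "picture_vs_number_classifier")    # stage 2
    rank = classify(image_path, f"{suit}_{kind}_classifier")       # stage 3
    return f"{rank} of {suit}"

print(identify_card("card.jpg"))   # -> "4 of hearts" with the canned answers
```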
@all: I stumbled upon an interesting project lately. I do not know if it is of any value for you guys, but it might be worth checking out...
http://lisajamhoury.com/portfolio/kinectron/
https://kinectron.github.io/docs/intro.html
https://github.com/kinectron/kinectron
Have you guys considered this as a mic option?
https://www.seeedstudio.com/ReSpeaker-4-Mic-Array-for-Raspberry-Pi-p-2941.html
"this 4-Mics version provides a super cool LED ring, which contains 12 APA102 programmable LEDs. With that 4 microphones and the LED ring, Raspberry Pi would have ability to do VAD(Voice Activity Detection), estimate DOA(Direction of Arrival) and show the direction via LED ring, just like Amazon Echo or Google Home"