Cognitive Vision

How to add the Cognitive Vision robot skill

  1. Load the most recent release of ARC (Get ARC).
  2. Press the Project tab from the top menu bar in ARC.
  3. Press Add Robot Skill from the button ribbon bar in ARC.
  4. Choose the Camera category tab.
  5. Press the Cognitive Vision icon to add the robot skill to your project.

Don't have a robot yet?

Follow the Getting Started Guide to build a robot and use the Cognitive Vision robot skill.

How to use the Cognitive Vision robot skill

Use the Microsoft Cognitive Computer Vision cloud service to describe or read the text in images. The images come from the Camera Device added to the project. This plugin requires an internet connection. If you are using a WiFi-enabled robot controller (such as the Synthiam EZ-B v4 or IoTiny), please consult its manual to configure WiFi client mode, or add a second USB WiFi adapter from this tutorial.


The behavior control will detect objects using cognitive machine learning. The robot skill will analyze the image, and each detected object will be stored in variable arrays—the width, height, location, and description of each object. The robot skill will also analyze the image for adult content. Use the Variable Watcher to view the detected details in real-time.
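The behavior above can be sketched in a short script. This is a hedged example only: $VisionConfidence and $VisionDescription appear elsewhere on this page, but the object-count and per-object array names ($VisionObjectCount, $VisionObjectDescription) are assumptions for illustration; open the Variable Watcher to confirm the actual variable names your version of the skill sets.

```
# Sketch only: speak the overall description, then iterate detected objects.
# $VisionObjectCount and $VisionObjectDescription are assumed names; check
# the Variable Watcher for the real ones.
say("I am " + $VisionConfidence + " percent certain that i see " + $VisionDescription)

$i = 0
RepeatWhile($i < $VisionObjectCount)
  say("Object " + $i + ": " + $VisionObjectDescription[$i])
  $i = $i + 1
EndRepeatWhile
```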

Educational Tutorial

The Robot Program created this educational tutorial for using the Cognitive Vision behavior control by Synthiam. This same procedure can be executed on any robot with a camera or PC with a USB Camera.

What Can You Do?

An easy example of using this control is adding this simple line of code to the control config. The code will speak out of the PC speaker what the camera sees. Here's a sample project: testvision.EZB


DJ Sures from Synthiam created this demo using a Synthiam JD by combining this Cognitive Vision behavior control, Pandora Bot, and speech recognition. He could have conversations with the robot, which is quite entertaining!

You will need a Camera Device and this plugin added to the project. It would look like this...

And add this simple line of code to the plugin configuration...

say("I am " + $VisionConfidence + " percent certain that i see " + $VisionDescription)

Limited Daily Quota

This robot skill uses a shared license key with Microsoft, enabling ARC users to experiment with and demo this robot skill. The shared license key provides ARC Pro users a daily quota of 500 requests per day. Because this robot skill uses a 3rd-party service that costs Synthiam per transaction, the daily quota is designed not to exceed our spending limit. If your application requires a higher daily quota, we can provide a robot skill that lets you specify your own license key and pay for the Microsoft service directly. Contact Us for more information.



#1   — Edited
I just watched the videos on these services and tried to set up text reading, and it just says it is 87 percent certain it sees the words.. but not the actual words. Anyone have pointers to get this to work? I tried both handwritten and typed words. Do they need to be a certain size? Or does this service from Microsoft need work?


say("I am " + $VisionConfidence + " percent certain that i see the words" + $VisionReadText)
#2   — Edited
1) The code doesn't have a space when appending $visionReadText to the string, which means it would come out as one weird run-together word (i.e. wordsometextthatwasdetected). Your code should be...


say("I am " + $VisionConfidence + " percent certain that i see the words " + $VisionReadText)

*Notice the space after "words"

2) I tested and it works fine reading text. Re-check your code with the space added and see if that fixes it.

User-inserted image
How did I miss that.. cross-eyed! Thank you once again! Must sleep..
#4   — Edited
...odd, the new version filters adult content?! How is that a feature? What was Microsoft trying to prevent, and for what use cases? Especially if you think of all the other filters they could have added for what's found in an image.
#5   — Edited
I don’t believe there’s any filtering being done. There is a value of how much adult content there is, but I don’t believe anything is filtered

you can always stand nude in front of your robot to test it out hahaha
#6   — Edited
Yeah, you're right, not a filter, more of a tag. Still wondering why that's a feature. Why not something useful like cats chasing dogs?

I did a little research earlier today and it's possible to create custom object detection projects. Training and prediction. Makes it more useful for a case-by-case robot.

..of course I got naked in front of the vision cognition and it said '100% sure you should put your clothes back on!' Lol.
Ya it’s a rating - it’ll help some applications to prevent abuse. I know we’re using it for a new service we’re releasing in beta next week. 

I'll take a look at the custom detection part. Although it is quite easy to do locally with the object tracking built into the camera control.
Looking forward to that beta!
Is cognitive emotion redundant, because Cognitive Face reports back the same emotions as well as the other options: age, name, etc? Not sure if I am missing something.
#10   — Edited
This skill, Cognitive Vision, does not return any face or emotion information.

I believe you are asking about Cognitive Face and Cognitive Emotion? Those two report similar stuff, except Emotion doesn't report face data. There are slight differences in the returned data of those two. This skill that you replied to is Cognitive Vision and is not related to either of those :)
I see the parameter you need to pass in the ControlCommand to read text is "ReadText". But what parameter do you send to describe an image? "DescribeImage"? 

Thomas Messerschmidt
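A hedged sketch of how those calls might look in a script. The skill's window name "Cognitive Vision" and the "DescribeImage" parameter are assumptions taken from the question above, not confirmed; the Cheat Sheet tab in the skill's ARC configuration lists the exact ControlCommand parameters it accepts.

```
# Assumed window and parameter names; verify against the skill's Cheat Sheet.
ControlCommand("Cognitive Vision", "ReadText")
# Detection may take a moment; a short Sleep() before speaking may be needed.
Sleep(2000)
say("I read the words " + $VisionReadText)

# Parameter name unconfirmed ("DescribeImage" is a guess from the question):
ControlCommand("Cognitive Vision", "DescribeImage")
Sleep(2000)
say("I see " + $VisionDescription)
```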
#12   — Edited