How to add the Cognitive Vision robot skill
- Load the most recent release of ARC (Get ARC).
- Press the Project tab from the top menu bar in ARC.
- Press Add Robot Skill from the button ribbon bar in ARC.
- Choose the Camera category tab.
- Press the Cognitive Vision icon to add the robot skill to your project.
Don't have a robot yet?
Follow the Getting Started Guide to build a robot and use the Cognitive Vision robot skill.
How to use the Cognitive Vision robot skill
Use the Microsoft Cognitive Computer Vision cloud service to describe or read the text in images. The images come from the Camera Device added to the project. This plugin requires an internet connection. If you are using a WiFi-enabled robot controller (such as the EZ-Robot EZ-B v4 or IoTiny), consult its manual to configure WiFi client mode or add a second USB WiFi adapter.
The Synthiam Cognitive Vision Robot Skill utilizes machine learning algorithms to enable robots to recognize and understand objects, faces, and even emotions. This skill is part of the Synthiam ARC (Autonomous Robot Control) platform, which provides a suite of tools for building and controlling robots.
Here's a breakdown of how the Cognitive Vision robot skill works:
- Integration with ARC: The Cognitive Vision skill is integrated into the Synthiam ARC software, which means it can be easily added to any robot project within the ARC environment.
- Camera Setup: To use the Cognitive Vision skill, the robot must be equipped with a compatible camera that is connected to the ARC platform. This camera serves as the robot's "eyes," capturing visual data for processing.
- Skill Configuration: Once the camera is set up, the Cognitive Vision skill can be configured within the ARC software. Users can select different cognitive services (such as object recognition, face detection, or emotion recognition) that they want the robot to use.
- Machine Learning Models: The skill relies on pre-trained machine learning models that have been developed to recognize various objects, faces, and emotions. These models can be updated and improved over time, enhancing the robot's recognition capabilities.
- Processing Visual Data: When the robot's camera captures visual data, the Cognitive Vision skill processes this data using the selected machine learning models. The skill analyzes the visual input to identify and categorize objects, detect faces, or interpret emotions.
- Outputs and Actions: The results of the cognitive analysis are then outputted within the ARC software. These results can trigger specific actions or behaviors in the robot. For example, if the robot recognizes a specific object, it can be programmed to approach it or manipulate it.
- Customization: Users can customize the Cognitive Vision skill by selecting different recognition models, adjusting confidence thresholds for recognition, and programming unique responses to different visual stimuli.
- Real-time Interaction: The Cognitive Vision skill operates in real-time, allowing the robot to interact dynamically with its environment. This real-time processing is essential for applications where immediate robot response is required.
- User Interface: The ARC platform provides a user-friendly interface for the Cognitive Vision skill, making it accessible for users with varying levels of technical expertise. This interface allows for easy configuration and monitoring of the robot's vision capabilities.
- Integration with Other Skills: The Cognitive Vision skill can be used in conjunction with other skills within the ARC platform, enabling complex and intelligent robot behaviors. For example, it can be combined with navigation skills to allow the robot to move towards recognized objects or individuals.
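To make the last two points concrete, here is a minimal sketch of a detection-complete script in EZ-Script. It assumes the scene description is stored in $VisionDescription (the variable used in the example later on this page) and that the Contains() string helper is available; adjust the names to match your configuration.
# Sketch of a detection-complete script; $VisionDescription is assumed from this page's example.
IF (Contains($VisionDescription, "person"))
  Say("Hello there!")
  # Forward() and Stop() are standard ARC movement commands; approach briefly.
  Forward()
  Sleep(1000)
  Stop()
ENDIF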
The behavior control detects objects using cognitive machine learning. The robot skill analyzes the image and stores each detected object in variable arrays holding the width, height, location, and description of each object. The robot skill also analyzes the image for adult content. Use the Variable Watcher to view the detected details in real time.
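For example, a script could walk those arrays after a detection completes, using EZ-Script's GetArraySize() and REPEAT loop. The array names below are hypothetical, since this page does not list them; open the Variable Watcher to see the actual names this skill creates.
# Hypothetical array name; confirm the real one in the Variable Watcher.
$count = GetArraySize("$VisionObjectDescription")
IF ($count > 0)
  REPEAT($i, 0, $count - 1, 1)
    Say("Object " + $i + " is " + $VisionObjectDescription[$i])
  ENDREPEAT
ENDIF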
The configuration menu for this robot skill allows setting values for the scripts and variables that hold the detected data.
Scripts
- Detect:
This script will be executed after a detection is completed. You can populate this script to speak the detected object or interact with the detected details.
- Read Text:
This script will be executed after a Read Text is completed. You can populate this script to speak the detected text in the image or perform some interaction with the detected details.
Variables
- Detected Scene:
This variable holds the description of the image after a detection. It is populated after a Detect instruction is sent.
- Confidence:
This variable holds the confidence value for the detected information, indicating how confident the system is in its detection.
- Read Text:
This variable holds the text that was detected in an image. This is populated after a Read Text instruction is sent.
The robot skill can execute scripts after vision recognition completes, allowing the robot to speak or perform an action based on the detected objects. There is a script for "Describe", which is executed after a Detect completes, and a script for Read Text, which is executed after a ReadText completes.
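As a minimal example, a Read Text script could simply speak whatever text was found. The variable name $VisionText below is an assumption; use the name shown in this skill's configuration menu.
# Sketch of a Read Text script; $VisionText is an assumed variable name.
IF ($VisionText != "")
  Say("The text says " + $VisionText)
ENDIF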
Several ControlCommands exist for interacting with this robot skill from other skills. Specifically, when you want to scan the image for the latest detection information, you must send the Detect ControlCommand.
ControlCommand("Cognitive Vision", "Detach");
Detaches the robot skill from the current camera.
ControlCommand("Cognitive Vision", "Detect");
Simulates pressing the "detect" button. This will detect the objects in the current camera frame and populate the image description variable. The variable can be configured in the configuration menu of this robot skill.
ControlCommand("Cognitive Vision", "ReadText");
Reads any text in the image and stores it in the variable configured in this robot skill's configuration menu.
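Putting these together, another skill or a standalone script can request a detection and then speak the result. Reacting in the detection-complete script is the usual approach, but a simple polling sketch looks like this; the three-second pause is an assumption to give the cloud service time to respond.
# Request a fresh detection from the Cognitive Vision skill.
ControlCommand("Cognitive Vision", "Detect")
# Assumed wait for the cloud round trip; tune this for your connection.
Sleep(3000)
Say("I see " + $VisionDescription)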
The Robot Program created this educational tutorial for using the Cognitive Vision behavior control by Synthiam. The same procedure can be followed on any robot with a camera, or on a PC with a USB camera.
What Can You Do?
An easy example of using this control is to add this simple line of code to the control config. The code will speak what the camera sees out of the PC speaker. Here's a sample project: testvision.EZB
DJ Sures from Synthiam created this demo using a Synthiam JD by combining this Cognitive Vision behavior control, Pandora Bot, and speech recognition. He could have conversations with the robot, which is quite entertaining!
You will need a Camera Device and this plugin added to the project. It would look like this...
And add this simple line of code to the plugin configuration...
say("I am " + $VisionConfidence + " percent certain that i see " + $VisionDescription)
Limited Daily Quota
This robot skill uses a shared license key with Microsoft, enabling ARC users to experiment with and demo this robot skill. The shared license key provides ARC Pro users with a quota of 500 requests per day. Because this robot skill uses a third-party service that costs Synthiam per transaction, the daily quota is designed not to exceed our spending limit. If your application requires a higher daily quota, we will provide a robot skill that allows you to specify your own license key and pay for the Microsoft service directly. Contact Us for more information.
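If you poll the service from a loop, pace the requests to stay under the shared quota: 500 requests per day works out to roughly one every 173 seconds, so a five-minute interval (about 288 requests per day) leaves comfortable headroom. A minimal sketch:
# Poll every five minutes to stay well under the 500-requests-per-day shared quota.
:loop
ControlCommand("Cognitive Vision", "Detect")
Sleep(300000)
Goto(loop)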