My robot would be sitting at a table, looking down at the objects on it. It would be able to identify each of them: salt, pepper, a can of beer, a knife, a plate, a spoon, and carrots on the plate. It would also know the coordinates of each object in 3D space. Knowing that, it would reach out and pick up the beer. I think TensorFlow can do some of this already. Microsoft Cognitive vision gives you the objects it sees, but not their locations.
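As a rough sketch of the "coordinates in 3D space" step: if a detector returns a 2D bounding box for each object and a depth camera gives the distance to it, a pinhole camera model can turn the box center into an (x, y, z) point in the camera frame. Everything here is an assumption for illustration. The `Detection` class, the `deproject` and `locate` helpers, the camera intrinsics, and the sample boxes and depths are all made up, and the resulting point would still need to be transformed into the arm's frame before the robot could reach for it.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detector output: a label, a pixel box, and a depth reading."""
    label: str
    box: tuple       # (x_min, y_min, x_max, y_max) in pixels
    depth_m: float   # distance along the camera axis, in meters


def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Pinhole model: map pixel (u, v) at a known depth to camera-frame meters."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def locate(det, fx, fy, cx, cy):
    """Use the center of the bounding box as the object's position."""
    x_min, y_min, x_max, y_max = det.box
    u = (x_min + x_max) / 2
    v = (y_min + y_max) / 2
    return deproject(u, v, det.depth_m, fx, fy, cx, cy)


# Assumed intrinsics for a 640x480 camera (focal lengths and principal point).
FX, FY, CX, CY = 600.0, 600.0, 320.0, 240.0

# Made-up detections standing in for what a real detector would return.
detections = [
    Detection("beer can", (380, 200, 440, 320), 0.60),
    Detection("salt", (100, 220, 140, 300), 0.75),
]

positions = {d.label: locate(d, FX, FY, CX, CY) for d in detections}
target = positions["beer can"]
print("reach toward", target)
```

The box center is only a crude estimate of where to grasp; a real setup would likely sample depth over the whole box and account for the object's shape.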