My robot would be sitting at a table, and it would look down at the objects on that table. It would be able to identify the objects on the table: salt, pepper, a can of beer, a knife, a plate, a spoon, and carrots on the plate. It would also know the coordinates of each object in 3D space. Knowing that, it would reach out and pick up the beer. I think TensorFlow can do some of this already. Microsoft Cognitive vision gives you the objects it sees, but not their locations.
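
To make the "coordinates in 3D space" part concrete, here is a rough sketch of one way it could work. It assumes the detector returns a pixel bounding box for each object and that a depth camera supplies the distance at the centre of that box. The `Detection` class, the function names, and the camera intrinsics (`FX`, `FY`, `CX`, `CY`) are all made up for illustration and do not come from any particular library. The math is the standard pinhole camera back-projection, and the result is in the camera's frame, not the arm's.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Detection:
    """Hypothetical detector output: a label, a pixel box, and a depth reading."""
    label: str
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
    depth_m: float  # distance along the camera's optical axis, in metres


def pixel_to_camera_xyz(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) at a known depth using the pinhole camera model."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def locate(det, fx, fy, cx, cy):
    """Return the 3D position of the centre of a detection's bounding box."""
    x_min, y_min, x_max, y_max = det.box
    u = (x_min + x_max) / 2.0
    v = (y_min + y_max) / 2.0
    return pixel_to_camera_xyz(u, v, det.depth_m, fx, fy, cx, cy)


# Assumed intrinsics for a 640x480 camera; real values come from calibration.
FX, FY, CX, CY = 600.0, 600.0, 320.0, 240.0

beer = Detection("beer can", (400, 200, 440, 280), 0.5)
print(beer.label, locate(beer, FX, FY, CX, CY))
```

To actually reach for the beer, the arm would still need these camera-frame coordinates converted into its own frame, which depends on where the camera is mounted.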