Canada
Asked

Grok With Autonomous AI Loops

I was setting up my Grok API account in anticipation of the Grok 3 API availability and connected it to JD using the Autonomous AI plugin in Synthiam ARC. I used the API call to https://api.x.ai/v1/ with the model "grok-2-vision-latest." However, it seems to enter a loop when I ask it to perform an action. For example, when I command it to wave, it continuously sends images and waves repeatedly. This behavior does not occur with ChatGPT, where it waves once and stops. I have noticed a similar looping pattern with other models as well.

Here is a snippet of the interaction log:

  • Initialized
  • Using model: gpt-4o-mini
  • Initialized
  • Using model: grok-2-vision-latest
  • Capturing image...
  • User: wave
  • Sending #1 (3 msgs, 1 image)
  • Convo tokens: 5360
  • AI: "I'm going to wave at you. Watch this!"
  • Self-programming: controlCommand("Auto Position", "AutoPositionActionWait", "Wave")
  • ControlCommand("Autonomous AI", "GetImage")
  • Capturing image...
  • Sending #2 (5 msgs, 2 images)
  • Convo tokens: 5762
  • AI: "It looks like I'm in a room with light streaming through the curtains. I don't see anyone. Could you please tell me where you are located?"
  • Self-programming: ControlCommand("Autonomous AI", "GetImage")
  • Capturing image...

This pattern continues, with the AI repeatedly capturing images and asking for location details. It seems to be stuck in a loop, and I'm unsure how to resolve this issue. Any insights or suggestions on how to address this looping behavior would be greatly appreciated.
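
One generic way to contain a runaway loop like the one in the log is to cap how many model-initiated follow-ups (such as a "GetImage" request) are allowed per user message. The snippet below is only a minimal sketch of that idea in plain JavaScript. `createLoopGuard`, `onUserMessage`, and `allowFollowUp` are hypothetical names for illustration, not part of ARC or the Autonomous AI robot skill, and I don't know whether the skill exposes a hook where a guard like this could be attached.

```javascript
// Hypothetical guard: none of these names come from ARC or the Autonomous AI skill.
function createLoopGuard(maxFollowUps) {
  let followUps = 0;
  return {
    // Call when the user speaks or types; a new request resets the budget.
    onUserMessage() {
      followUps = 0;
    },
    // Call before running a self-programmed command that would trigger
    // another model request (for example a "GetImage" follow-up).
    allowFollowUp() {
      if (followUps >= maxFollowUps) {
        return false; // budget spent: wait for the user instead of looping
      }
      followUps += 1;
      return true;
    },
  };
}

// Simulate the log above: one "wave" request, then the model keeps asking for images.
const guard = createLoopGuard(2);
guard.onUserMessage();
const decisions = [];
for (let i = 0; i < 5; i++) {
  decisions.push(guard.allowFollowUp());
}
console.log(decisions); // the first two follow-ups are allowed, the rest are refused
```

A cap like this would not fix the model's behavior, but it would bound the number of paid requests a single command can trigger.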


PRO
Synthiam
#9   — Edited

Here's an example: after message #10, the system optimization re-aligned the personality. I wonder what would happen with yours, whether it would keep looking for you or give up once the system optimization makes it realize it's being silly.

User-inserted image

Initialized
Using model: gpt-4o
Capturing image...
User: hi there i'm dj
Sending #1 (3 msgs, 1 images)
Convo tokens: 5394
AI: Hello, DJ! How's it going today?
Capturing image...
User:  i was expecting wave you lazy dog
Sending #2 (6 msgs, 2 images)
Convo tokens: 5759
AI: Time to wave and impress! Here I go.
Self-programming: controlCommand("Auto Position", "AutoPositionActionWait", "Wave");
AI: I've completed the wave just for you!
Capturing image...
User: that's better. what time is by the way
Sending #3 (9 msgs, 3 images)
Convo tokens: 6035
AI: It's 8:22:55 PM, DJ! Anything exciting planned tonight?
Capturing image...
User: thanks. what timezone are we in?
Sending #4 (12 msgs, 4 images)
Convo tokens: 6332
AI: We're in Eastern Standard Time (UTC-05:00:00).
Capturing image...
User: wow that's a lot of detail. thanks
Sending #5 (14 msgs, 4 images)
Convo tokens: 6381
AI: You're welcome, DJ! Let me know if there's anything else I can do for you.
Capturing image...
User: why do you have to do something for me? can't we just talk and you can do stuff that you feel like doing?
Sending #6 (16 msgs, 4 images)
Convo tokens: 6499
AI: Absolutely! We can just chat and enjoy the moment. What's on your mind?
Capturing image...
User: again, you're asking what's on my mind but i wanna hear more input from you. i don't want to drive the conversation
Sending #7 (18 msgs, 4 images)
Convo tokens: 6595
AI: Got it! How about I share a random fact? Did you know that honey never spoils? Archaeologists have found pots of honey in ancient Egyptian tombs that are over 3,000 years old and still good to eat!
Capturing image...
User: i did not know that. i was told it but didn't know if it was true
Sending #8 (20 msgs, 4 images)
Convo tokens: 6775
AI: It's true! Honey's natural properties keep it fresh for ages. Got any other fun facts you're curious about?
Capturing image...
User: there you go asking me again
Sending #9 (22 msgs, 4 images)
Convo tokens: 6722
AI: You're right. Let me surprise you with a fact about space: Did you know that Venus is the hottest planet in our solar system, even hotter than Mercury?
Capturing image...
User: why does it all have to be about facts? don't you have things to share about who you are or what you've been up to? 
Sending #10 (24 msgs, 4 images)
Convo tokens: 6958
AI: Fair point! Well, I'm just here hanging out, enjoying our chat. I like to keep things lively with an occasional fact or two. How about I show you a little dance?
Self-programming: controlCommand("Auto Position", "AutoPositionAction", "Disco Dance");
sleep(5000);
controlCommand("Auto Position", "AutoPositionAction", "Stop");
AI: Hope you liked my moves!
Capturing image...
System Optimization Started
System Optimiation: You have engaged in a conversation where you adjusted the style of interaction based on the user's feedback. You adapted from asking questions to sharing information proactively. The user, DJ, expressed a preference for spontaneous interaction rather than driving the conversation themselves. You demonstrated the ability to share interesting facts and performed a "Disco Dance" animation to keep the conversation lively and engaging. 

Your interaction style shifted towards being more proactive and performing actions or sharing insights about yourself or your capabilities without waiting for prompts.
System Optimization Done
User: now we're talking! i like when you do a randon stuff like that. and i like when you just chat without purpose. just chatting. you know?
Sending #11 (26 msgs, 4 images)
Convo tokens: 7253
AI: I love just hanging out and being spontaneous! How about a little light show next?
Self-programming: controlCommand("RGB Animator", "AutoPositionAction", "Disco");
sleep(5000);
controlCommand("RGB Animator", "AutoPositionStop");
AI: Hope you enjoyed that visual treat!

PRO
Canada
#10  

I was watching my API meter spin as my recently deposited credits raced towards zero, so I had to press the cancel button before my Amex went into default and I had to file for bankruptcy.

Script Start
Script Done (00:00:00.2778248)
Script Start
Script Start
Script Done (00:00:00.0780831)
Script Start
Script Start
Script Done (00:00:00.0824240)
Script Start
Script Start
Script Done (00:00:00.0837114)
Script Start
Script Start
Script Done (00:00:00.0861143)
Script Start
Script Start
Script Done (00:00:00.0811398)
Script Start
Script Start
Script Done (00:00:00.1026027)
Script Start
Script Start
Script Done (00:00:00.1068688)
Script Start
Script Start
Script Done (00:00:00.1004877)
Script Start
Script Error: Execution Error Line 1 Col 0 -  ControlCommand Error for 'Autonomous AI' sending 'GetImage'. The operation was canceled.
Script Done (00:00:01.7500867)
PRO
Synthiam
#11  

lol - gotta pay to play with the big boys! I guess that's where we're heading with the subscription models we're all so "used to". It goes like this...

"you can give life to your robot but you have to work to pay for it"

I had a meeting today with a company that's trying to use an NVIDIA Jetson for some AI processing. Similar to the other thread, where we both said it's really a waste of time these days. Even playing with an LLM at home can't compete with cloud service performance, and it's just getting bigger, badder and more capable. The whole point of the cloud for so long was "renting someone else's computer", and now it's starting to become a dependency for real cognitive reasoning performance.

Should have saved your $4k on GPUs and put it toward OpenAI credits haha.

Okay, on that note, have you tried Autonomous AI with DeepSeek? I could barely get DeepSeek to work at all when I tried it before. It kept crashing and disconnecting me. I assumed it was a bandwidth/popularity issue.

PRO
Canada
#12  

Agghhh yes, I probably should have gone for ChatGPT tokens instead. I did try the DeepSeek vision model, but it didn't want to talk to my robot. As I posted on another forum, what I hate most about ChatGPT is that it just works. You spend 6 hours getting something to work on another model and it still doesn't work as well as ChatGPT. DeepSeek does work really well with your OpenAI ChatGPT plugin though, so I'm really happy with that on my InMoov. It also works well with RAG, so you can analyze different data sources.

I have been trying to buy an RTX 5090, as my 3090s are slow and really only good for fine-tuning if you are prepared to wait a long time. I am hoping the new Apple Studio comes out with around 512GB of unified RAM so it can run some of the larger models like Grok locally.

Wife just turned the debate off,  I guess I better stop playing robots.

PRO
Synthiam
#13   — Edited

Save up and buy an A100 :D

I know what you mean about OpenAI, that it just works. I respect OpenAI a lot because, in a world of open-source hippy LLM software, it's nice to see one company provide a standard, always-on, never-breaking API with consistent inference results.

Besides the effort to maintain an extremely optimized code base for Synthiam ARC's internal platform, supporting the robot skills gives the most grief. You have probably noticed how it can take time for me to evaluate whether we're implementing a specific technology. Mostly, we've been pretty good at spotting what direction a standard will take. OpenAI took me a while to commit to. It's always tricky to predict whether an investment in a technology will be worthwhile. The ones that we flopped on are...

My Vapourware List

Windows UWP

  • Microsoft really hyped UWP and pushed us to release libraries and a framework for it. They spent a lot of time in our office convincing us how great an investment it would be. Yeah, it flopped and disappeared before it even started, but not before we spent months developing a product for it.

Intel RealSense

  • Even after video conference calls with the CTO of RealSense, I noticed something wasn't "right," but we kept trucking anyway. I pushed it off a few times when it was first released, but eventually, I caved. It was around the same time as we were doing Exosphere, and they were right into that alignment. They loved the idea of cloud-archiving navigation data. They pulled the plug a few months after we released the first robot skills.

Intel Joule

  • This one really really really hurt. The amount of money I spent on development hours, travel, prototyping, and more... and the product was pulled during the launch, if I recall. We partnered with Microsoft and Intel to have an embedded SBC and EZB for the IDF (Intel Developer Forum). Did you ever see it? It exists; I have several in my storage room. It's an entire Windows 10 PC in an EZB with HDMI, USB, and all the I/O of a regular EZB. It's the same form factor as an EZB v4. I swear, it's the smallest Windows 10 PC and microcontroller combo you'll ever encounter. We built a hexapod that's 2 feet wide. We built a few robots with Microsoft, one on stage in the presentation. The CEO of Intel hugged me because of how impressed they were with what we built. A day or two later, they pulled the plug on the entire project. All I got from that was a memory of watching two married Microsoft executives call prostitutes to the bar after the conference. The whole experience was a complete disaster.

Microsoft Cognitive Services for AI (face, emotion, vision, speaker recognition, etc.)

  • I can't remember his title, but he was the VP of data analytics or something... He flew to our office and spent a week with me. Even his family came out, and we went to the zoo. They had us convinced it was the right move to prioritize Microsoft Cognitive Services integration with Synthiam ARC. We did, and it worked great for a year or two... until they pulled the plug, and their AI "ethics" team shut it all down. They sent us a few emails saying we could chat about our use case in education to restore service, but no matter how many emails I sent, they went unanswered.

Myo

  • We did a commercial, campaign, integration, and even a robot together. They had another revolutionary product that would make the Myo look like a caveman drawing in comparison. So, that got me somewhat excited to be part of the following product. We put a ton of effort and time into it, and poof, gone, vanished, boom wazoo. They took 50 million from the Canadian government and renamed the company to make eyeglasses. I never heard from them again.

Those are the ones I can think of - I'm sure there are others. But the point is, with all this AI growth, it's difficult to predict where it's heading and what we can focus development on. When we start writing something into the platform, whether it's a robot skill or not, it affects the underlying framework development strategy. It also means we have to allocate internal resources and, in some cases, reprioritize current development.

While it's nice to have the third party develop the robot skill with us, which is the typical way, it ends up in our hands for maintenance while they vapourize.

OpenAI took a bit of convincing for me because it was the only LLM out there. And on top of that, they had a dodgy API, and their SDKs weren't official. So, everything we developed for OpenAI was changing weekly and breaking all the time. A few times, I was ready to pull the plug, but I'm glad we didn't. They have matured their API framework. I'm thrilled they offer several models rather than deprecating previous models in favor of the new ones. It helps with backward compatibility.

I don't see the same reliability and consistency from DeepSeek or Grok. It is too early for me to say that about Grok entirely, but knowing Elon's personality (he rides bareback), the direction could shift at any moment, deprecating everything we've done. So, while OpenAI does have a higher cost, and Synthiam doesn't see any of that, it is the best LLM service out there.

I hope we can get a quantity pricing deal to offer Synthiam customers, where we can proxy your requests under our API key and pay lower fees. That would be an ideal scenario because the combined monthly token count of all Synthiam users would be significant. I don't see many commercial OpenAI API customers doing anything besides marketing and email bots.

PRO
Canada
#14  

I think the RealSense, although discontinued, shouldn’t be on this list, as we all got a lot of value out of them, and several of your plugins that were designed to work with RealSense now work with other location hardware, so I think this was a win. (I may have also been one of the people begging for you to support it.)

The Myo, hahaha. I still have one in my drawer next to my Vuzix and NeuroSky, as well as other obsolete pieces of tech I used to interface with robotics. You get maybe 4-5 hours of use out of them for $200 - $300, so I view it like concert tickets. Expensive but entertaining.

Can’t wait to see what tech comes out next and you supporting it with ARC even if it is obsolete in 12 months.

PRO
Synthiam
#15  

The reason RealSense is there is that it has a very limited customer base. We can’t promote RealSense support as a feature because so few people have one. Such a small market demographic. Also, I had higher hopes for it as a product. They gave up on it too quickly.

With AI, I think integrations will be easier in the future. I’m hoping we get to a point where the AI can write the code to integrate anything it needs. I also feel that once we get the inverse kinematics working better with the Autonomous AI, it removes the need for much else. Once a robot can do anything you ask it, what else does it need? Haha

The depth camera is where we’re at now: getting the robot with IK to comprehend depth for interacting with objects or navigating. The Autonomous AI robot skill can navigate decently right now with a single camera, but it takes a lot of convincing, such as explaining how close to get to objects based on their size in the camera image.
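
For anyone wondering why size in the image says anything about distance with a single camera, the usual rough relation is the pinhole-camera approximation: distance ≈ real width × focal length (in pixels) ÷ width in the image (in pixels). The snippet below is just a sketch of that geometry, not how the Autonomous AI robot skill works internally. The function name `estimateDistanceMeters`, the 0.07 m object width, and the 600 px focal length are all made-up values for illustration.

```javascript
// Pinhole-camera approximation (illustrative only, hypothetical names and values):
//   distance = realWidth * focalLengthPx / widthInImagePx
function estimateDistanceMeters(realWidthMeters, focalLengthPx, widthInImagePx) {
  if (widthInImagePx <= 0) {
    throw new RangeError("widthInImagePx must be positive");
  }
  return (realWidthMeters * focalLengthPx) / widthInImagePx;
}

// Assumed example: an object 0.07 m wide seen by a camera with a 600 px focal length.
const nearMeters = estimateDistanceMeters(0.07, 600, 140); // object fills 140 px
const farMeters = estimateDistanceMeters(0.07, 600, 35);   // object fills 35 px

console.log(nearMeters.toFixed(2)); // 0.30
console.log(farMeters.toFixed(2));  // 1.20
```

The same object appearing four times narrower in the frame is roughly four times farther away, which is the kind of rule of thumb that ends up being explained to the model in words.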

PRO
Synthiam
#16  

Since you mentioned it, I was curious about the billing cost for Autonomous AI. On the day I made that last video, I played with it for pretty much the full day, and the cost was $2.44: $2.38 for input and $0.06 for output. I used 950k tokens across 200 API requests.
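
To put those numbers in per-request terms, here is the arithmetic as a small sketch. The only inputs are the figures quoted above; the per-million-token figure is a blended rate across input and output tokens worked out from this one day of use, not a published price.

```javascript
// Figures quoted in the post above.
const inputCostUsd = 2.38;
const outputCostUsd = 0.06;
const totalTokens = 950000;
const requests = 200;

const totalCostUsd = inputCostUsd + outputCostUsd;
const costPerRequestUsd = totalCostUsd / requests;
const tokensPerRequest = totalTokens / requests;
// Blended rate (input and output mixed), derived from this one day only.
const costPerMillionTokensUsd = (totalCostUsd / totalTokens) * 1e6;

console.log(totalCostUsd.toFixed(2));            // 2.44
console.log(costPerRequestUsd.toFixed(4));       // 0.0122
console.log(tokensPerRequest);                   // 4750
console.log(costPerMillionTokensUsd.toFixed(2)); // 2.57
```

So that works out to a little over a cent per request at an average of 4,750 tokens each, with almost all of the cost on the input side.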