I was setting up my Grok API account in anticipation of the Grok 3 API availability and connected it to my JD using the Autonomous AI plugin in Synthiam ARC. I pointed the API endpoint at https://api.x.ai/v1/
with the model "grok-2-vision-latest". However, it enters a loop when I ask it to perform an action. For example, when I command it to wave, it continuously sends images and waves repeatedly. This does not happen with ChatGPT, which waves once and stops. I have noticed similar looping with other models as well.
Here is a snippet of the interaction log:
- Initialized
- Using model: gpt-4o-mini
- Initialized
- Using model: grok-2-vision-latest
- Capturing image...
- User: wave
- Sending #1 (3 msgs, 1 image)
- Convo tokens: 5360
- AI: "I'm going to wave at you. Watch this!"
- Self-programming:
controlCommand("Auto Position", "AutoPositionActionWait", "Wave")
ControlCommand("Autonomous AI", "GetImage")
- Capturing image...
- Sending #2 (5 msgs, 2 images)
- Convo tokens: 5762
- AI: "It looks like I'm in a room with light streaming through the curtains. I don't see anyone. Could you please tell me where you are located?"
- Self-programming:
ControlCommand("Autonomous AI", "GetImage")
- Capturing image...
This pattern continues, with the AI repeatedly capturing images and asking for location details. It seems to be stuck in a loop, and I'm unsure how to resolve this issue. Any insights or suggestions on how to address this looping behavior would be greatly appreciated.
Here's an example - after message #10, the system optimization re-aligned the personality. I wonder what would happen with yours: would it keep looking for you, or give up once the system optimization makes it realize it's being silly?
I was watching my API meter spin as my recently deposited credits raced toward zero, so I had to press the cancel button before my Amex went into default and I had to file for bankruptcy.
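One way to think about breaking this loop is a per-command capture budget: let the model request a couple of images per user command, then refuse further captures until the human says something new. The sketch below is purely hypothetical - the real Autonomous AI plugin may not expose hooks like these - but it shows the idea in plain Python:

```python
# Hypothetical sketch of a capture guard to break the image-request loop.
# None of these names come from the actual plugin API.

MAX_CAPTURES_PER_COMMAND = 2

class CaptureGuard:
    """Allows only a limited number of GetImage calls per user command."""

    def __init__(self, limit=MAX_CAPTURES_PER_COMMAND):
        self.limit = limit
        self.count = 0

    def new_command(self):
        # Reset the budget whenever the human issues a new command.
        self.count = 0

    def allow_capture(self):
        # Return True while the budget lasts, False once it's spent.
        if self.count >= self.limit:
            return False
        self.count += 1
        return True

guard = CaptureGuard()
guard.new_command()              # user says "wave"
print(guard.allow_capture())     # True  - first image allowed
print(guard.allow_capture())     # True  - second image allowed
print(guard.allow_capture())     # False - loop broken here
```

With a guard like this, the second "GetImage" request after the wave would still go through, but the endless capture-and-ask cycle would stop at the cap instead of burning tokens.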
lol - gotta pay to play with the big boys! I guess that's where we're heading with the subscription models we're all so "used to". It goes like this...
"you can give life to your robot but you have to work to pay for it"
I had a meeting today with a company that's trying to use an nvidia jetson for some ai processing. Similar to the other thread - where we both said it's really a waste of time these days. Even playing with an LLM at home can't compete with cloud service performance, and it's just getting bigger, badder and more capable. The whole point to the cloud for so long was "renting someone else's computer", and now it's starting to become a dependency for real cognitive reasoning performance.
Should have saved your $4k on GPUs and put it toward OpenAI credits haha.
Okay, on that note, have you tried Autonomous AI with DeepSeek? I could barely get DeepSeek to work at all when I tried it before. It kept crashing and disconnecting me. I assumed it was a bandwidth/popularity issue.
Agghhh, yes, I probably should have gone for ChatGPT tokens instead. I did try the DeepSeek vision model, but it didn't want to talk to my robot. As I posted on another forum, what I hate most about ChatGPT is that it just works. You can spend 6 hours getting something to work on another model and it still doesn't work as well as ChatGPT. DeepSeek does work really well with your OpenAI ChatGPT plugin though, so I'm really happy with that on my InMoov. It also works well with RAG, so you can analyze different data sources.
I have been trying to buy an RTX 5090, as my 3090s are slow and really only good for fine-tuning if you are prepared to wait a long time. I am hoping the new Apple Studio comes out with around 512GB of unified RAM so it can run some of the larger models, like Grok, locally.
Wife just turned the debate off, I guess I better stop playing robots.
Save up and buy an A100.
I know what you mean about OpenAI - that it just works. I respect OpenAI a lot because, in a world of open-source hippy LLM software, it's nice to see one company provide a standard, always-on, never-breaking API with consistent inference results.
Besides the effort to maintain an extremely optimized code base for Synthiam ARC's internal platform, supporting the robot skills gives the most grief. You have probably noticed how it can take time for me to evaluate whether we should implement a specific technology. Mostly, we've been pretty good at spotting what direction a standard will take. OpenAI took me a while to commit to. It's always tricky to predict whether an investment in a technology will be worthwhile. The ones we flopped on are...
My Vapourware List
- Windows UWP
- Intel RealSense
- Intel Joule
- Microsoft Cognitive Services for AI (face, emotion, vision, speaker recognition, etc.)
- Myo
Those are the ones I can think of - I'm sure there are others. But the point is, with all this AI growth, it's difficult to predict where it's heading and what we can focus development on. When we start writing something into the platform, whether it's a robot skill or not, it affects the underlying framework development strategy. It also means we have to allocate internal resources and, in some cases, reprioritize current development.
While it's nice to have a third party develop the robot skill with us, which is the typical way, it ends up in our hands for maintenance while they vapourize.
OpenAI took a bit of convincing for me because it was the only LLM out there. On top of that, they had a dodgy API, and their SDKs weren't official. So everything we developed for OpenAI was changing weekly and breaking all the time. A few times I was ready to pull the plug, but I'm glad we didn't. They have matured their API framework, and I'm thrilled they offer several models rather than deprecating previous models in favor of the new ones. It helps with backward compatibility.
I don't see the same reliability and consistency from DeepSeek or Grok. It's too early for me to say that about Grok with certainty, but knowing Elon's personality - he rides bareback, so at any moment the direction could shift, deprecating everything we've done. So, while OpenAI does cost more, and Synthiam doesn't see any of that money, it is the best LLM service out there.
I hope we can get a volume pricing deal to offer Synthiam customers, where we proxy your requests under our API key and pay lower fees. That would be an ideal scenario, because the combined monthly token count of all Synthiam users would be significant. I don't see many commercial OpenAI API customers doing anything besides marketing and email bots.
I think the RealSense, although discontinued, shouldn't be on this list, as we all got a lot of value out of them, and several of your plugins that were designed to work with the RealSense now work with other location hardware, so I think it was a win. (I may have also been one of the people begging for you to support it.)
The Myo, hahaha - I still have one in my drawer next to my Vuzix and NeuroSky, as well as other obsolete pieces of tech I used to interface with robotics. You get maybe 4-5 hours of use out of them for $200-$300, so I view it like concert tickets: expensive but entertaining.
Can't wait to see what tech comes out next, and to see you supporting it with ARC even if it is obsolete in 12 months.
The reason the RealSense is on the list is that it has a very limited customer base. We can't promote RealSense support as a feature because so few people have one - such a small market demographic. Also, I had higher hopes for it as a product; they gave up on it too quickly.
With AI, I think integrations will be easier in the future. I'm hoping we get to a point where the AI can write the code to integrate anything it needs. I also feel that once we get the inverse kinematics working better with the Autonomous AI, it removes the need for much else. Once a robot can do anything you ask it, what else does it need? Haha.
The depth camera is where we're at now: getting the robot with IK to comprehend depth for interacting with objects or navigating. The Autonomous AI robot skill can navigate decently right now with a single camera, but it takes a lot of convincing - explaining how close to get to objects based on their apparent size in the camera image.
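That "size in the camera image" heuristic is essentially the pinhole camera model: if you know (or guess) an object's real width and the camera's focal length in pixels, its apparent pixel width implies a distance. A minimal sketch, with made-up numbers for illustration:

```python
# Pinhole-camera sketch: estimate distance from apparent size.
# distance = focal_length_px * real_width_m / pixel_width
# The focal length and object width below are illustrative guesses,
# not values from any particular camera or robot.

def estimate_distance(real_width_m, pixel_width, focal_length_px):
    """Distance to an object of known real width, given its width in pixels."""
    return focal_length_px * real_width_m / pixel_width

# Suppose a 0.5 m wide chair appears 100 px wide with f = 600 px:
d = estimate_distance(0.5, 100, 600)
print(f"Estimated distance: {d:.1f} m")   # Estimated distance: 3.0 m
```

This is exactly why a single camera "takes a lot of convincing": the result is only as good as the assumed real-world width, which is what a depth camera measures directly instead.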
Since you mentioned it, I checked the billing cost for Autonomous AI. On the day I made that last video, I played with it for pretty much the full day, and the cost was $2.44: $0.06 for output and $2.38 for input, across 950k tokens and 200 API requests.
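Those numbers also let you back out a rough effective rate. Treating nearly all of the 950k tokens as input (the output cost is tiny), the reported figures imply roughly $2.50 per million input tokens and about 4,750 tokens per request - a sanity check, not a statement of any provider's official pricing:

```python
# Back-of-envelope check on the reported usage:
# $2.38 input + $0.06 output = $2.44 total, ~950k tokens, 200 requests.
# Approximation: we treat all 950k tokens as input, since output
# spend is negligible here.

input_cost = 2.38
output_cost = 0.06
total_tokens = 950_000
requests = 200

total = input_cost + output_cost
implied_rate_per_million = input_cost / (total_tokens / 1_000_000)
avg_tokens_per_request = total_tokens / requests

print(f"total: ${total:.2f}")                                  # total: $2.44
print(f"~${implied_rate_per_million:.2f} per 1M input tokens")  # ~$2.51 per 1M input tokens
print(f"~{avg_tokens_per_request:.0f} tokens per request")      # ~4750 tokens per request
```

At ~4,750 tokens per request, it's easy to see how a looping model (like the Grok example above) would chew through credits fast: every redundant GetImage round trip re-sends the growing conversation plus an image.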