Use ChatGPT with your robot. Give your robot a personality, conversational ability, and the power to describe images. This robot skill adds chatbot capability to your robot so you can have meaningful conversations that it will remember. Configure the personality in any way you wish so the robot can prefer specific subjects or hold particular opinions.
How to add the OpenAI ChatGPT robot skill
- Load the most recent release of ARC (Get ARC).
- Press the Project tab from the top menu bar in ARC.
- Press Add Robot Skill from the button ribbon bar in ARC.
- Choose the Artificial Intelligence category tab.
- Press the OpenAI ChatGPT icon to add the robot skill to your project.
Don't have a robot yet?
Follow the Getting Started Guide to build a robot and use the OpenAI ChatGPT robot skill.
How to use the OpenAI ChatGPT robot skill
OpenAI’s mission is to ensure that artificial intelligence benefits all of humanity, an effort that is essential to training AI systems to do what humans want. A Generative Pre-trained Transformer (GPT) is a large language model, and OpenAI's conversational version is commonly known as ChatGPT. This robot skill lets your robot use ChatGPT for human-like conversation.
ChatGPT is an advanced language model developed by OpenAI based on the Generative Pre-trained Transformer (GPT) architecture. It is designed to understand and generate human-like text, enabling it to engage in conversations, answer questions, and provide information naturally and intuitively for users.
Check Open AI Server Status
If you experience issues with server responses, check the OpenAI status page to see if everything is operational. The page can be found here: https://status.openai.com/
Use in Robotics
In robotics, integrating ChatGPT into a robot skill allows the robot to leverage natural language processing (NLP) capabilities. This can enhance the robot's interactivity and functionality in various ways:
1. Human-Robot Interaction: ChatGPT can facilitate communication between humans and robots. Users can speak or type commands and questions in natural language, and the robot can respond appropriately, making the interaction more accessible and user-friendly.
2. Task Automation: Robots equipped with ChatGPT can interpret complex instructions given in natural language and translate them into specific actions. For example, a user could tell the robot to "pick up the red block and place it on the shelf," and the robot would understand the task and execute it.
3. Customer Service: In a service-oriented setting, robots can use ChatGPT to provide customer assistance and information. They can answer questions, guide users through processes, and provide personalized recommendations.
4. Education and Entertainment: ChatGPT can make educational and entertainment robots more engaging by enabling them to carry on conversations, tell stories, or even conversationally teach lessons.
5. Accessibility: For users with disabilities, ChatGPT can make robots more accessible by allowing them to be controlled through natural language commands, removing the need for physical interaction with control interfaces.
6. Data Collection: Robots can use ChatGPT to conduct surveys or gather feedback through conversation, providing a friendly and interactive way for users to share information.
When integrating ChatGPT into the Synthiam platform, it is essential to consider the specific use cases for the robot skill to handle the expected types of interactions. This may involve training the model on domain-specific data or setting up predefined responses for specific queries to ensure the robot can handle tasks effectively and provide accurate information.
Screenshot
Disclaimer
OpenAI's ChatGPT should not be used as a source of facts or information to help you make life decisions. GPT is trained on data from the internet, which has not been fact-checked or verified. This means its responses are not fact-checked but assembled according to what the model thinks a reply should look like. In short, OpenAI prioritizes a response based on statistical popularity and sentence structure, not facts.
*Note: OpenAI responses should be treated as fiction.
Configuration
The configuration menu has several options. You can add keywords with associated scripts that execute based on responses from the bot. You can also have a script that runs on every response received. By default, the All Responses script will speak the response out of the PC speaker. Words in the "Contains" list can be separated by commas. The words are compared against the response, and the respective script will be executed if a word is contained within the response phrase.
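As a rough illustration, the "Contains" comparison behaves something like this sketch. This is plain JavaScript, not ARC script, and the function and entry names are hypothetical; the actual skill's implementation is not published.

```javascript
// Illustrative sketch of comma-separated "Contains" keyword matching.
// Each entry pairs a comma-separated keyword list with the script it triggers.
function findMatchingScripts(response, keywordEntries) {
  const lower = response.toLowerCase();
  return keywordEntries
    .filter(entry =>
      entry.contains
        .split(",")
        .map(w => w.trim().toLowerCase())
        .some(word => word.length > 0 && lower.includes(word))
    )
    .map(entry => entry.script);
}

const entries = [
  { contains: "happy, glad", script: "HappyAction" },
  { contains: "sad", script: "SadAction" }
];

console.log(findMatchingScripts("I am so glad to see you!", entries));
// → [ "HappyAction" ]
```

Because matching is a substring check against the whole response, short keywords can match inside longer words, which is one reason the manual later advises against overly generic phrases.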
Default Script
By default, the All Responses script from the script tab will speak the response from the PC speaker. If you use an EZB that supports audio output, the command can be edited to Audio.sayEZB() to speak out of the EZB.
Image Description Script
You can also specify a script to execute when the ControlCommand for getting an image description has completed.
Variables
Variables store the user's request and the bot's response. This allows the robot to speak the response, as done by default in the All Responses script. You can also specify the variable that stores the image description when using the describe-image ControlCommand.
Settings
The settings tab has options for configuring the bot behavior and the API key. Obtain an API key from the OpenAI website and paste it into the appropriate box. When choosing a model, view the pricing and details on their website here: https://openai.com/api/pricing/
If you receive an error that a GPT model does not exist, chances are it has been removed from the OpenAI API and is no longer accessible.
Base Domain: Use this setting to change the server for the AI request. The 'Base Domain' setting lets you specify the base URL for API requests, directing the robot's AI interactions to different OpenAI server environments or 3rd-party-compatible API endpoints. It is particularly useful for advanced users who run localized LLM servers compatible with OpenAI API calls, ensuring seamless and tailored AI experiences in their robotic applications.
*Note: we cannot provide a list of compatible 3rd party services for the Base Domain setting because there is no standardization across AI LLM products at this time.
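Conceptually, a base-domain override just swaps the host portion of the request URL while the API path stays the same. This sketch is an assumption for illustration; the skill's actual request construction is internal, and the default host and path shown are the standard OpenAI chat-completions endpoint.

```javascript
// Hypothetical sketch: how a Base Domain setting could change the request URL.
// An empty setting falls back to OpenAI's public API server.
function chatCompletionsUrl(baseDomain) {
  const base = (baseDomain && baseDomain.trim()) || "https://api.openai.com";
  // Strip trailing slashes so the path joins cleanly.
  return base.replace(/\/+$/, "") + "/v1/chat/completions";
}

console.log(chatCompletionsUrl(""));                       // default OpenAI server
console.log(chatCompletionsUrl("http://localhost:1234/")); // local LLM server
```

Local servers such as OpenAI-compatible LLM runtimes typically expose this same path, which is why a simple host swap is enough for many third-party setups.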
Custom GPT Model
OpenAI hosts many language models, some of which are experimental. You can view the list of models here: https://platform.openai.com/docs/models. To use one, select OTHER from the drop-down menu and enter the custom model name in the textbox below. By default, the GPT-4o model is used.
Best Match
The "Best Match" feature in the OpenAI ChatGPT robot skill is a powerful tool designed to enhance the interactivity and responsiveness of your robot. This feature allows the robot to execute specific scripts or actions based on the context and content of the conversation. It matches phrases from the conversation against a pre-defined list of keywords or phrases, each associated with a corresponding script.
This list of words/phrases is used to guess the best match based on the conversation, allowing your robot to execute a script matching the conversation's subject or tone. The robot can act sad or happy based on the type of conversation, or dance when discussing a party or favorite music. A few examples are included to suggest scripts to associate with them.
There is an option to determine what input the match should be based on. By default, the match will use the AI response to provide the most accurate interactive responses. If unchecked, the match will be based on the user's input.
*Note: To disable the best match feature, you must remove all options from the list. An empty list will disable the best match feature.
How It Works:
- Phrase Matching: The robot skill scans the conversation for phrases that match the ones in the Best Match list.
- Contextual Relevance: The phrases are chosen based on the robot's context, ensuring that responses are relevant and appropriate.
- Dynamic Interaction: When a phrase from the list is detected in the conversation, the robot triggers the associated script or action, allowing for a dynamic and interactive experience.
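The phrase-matching step above can be sketched as a simple word-overlap score. This is an illustrative guess at the behavior, not the skill's published algorithm; the function name and scoring are hypothetical.

```javascript
// Illustrative best-match sketch: score each candidate phrase by how many of
// its words appear in the conversation text, and return the top scorer.
function bestMatch(text, phrases) {
  const words = new Set(text.toLowerCase().split(/\W+/).filter(Boolean));
  let best = null;
  let bestScore = 0;
  for (const phrase of phrases) {
    const score = phrase
      .toLowerCase()
      .split(/\W+/)
      .filter(w => words.has(w)).length;
    if (score > bestScore) {
      bestScore = score;
      best = phrase;
    }
  }
  return best; // null when nothing overlaps, i.e. no script would be triggered
}

console.log(bestMatch("Let's dance, I love this party music!", ["party music", "sad story"]));
// → "party music"
```

A sketch like this also shows why specific, multi-word phrases work better than single generic words: they give the scorer more signal and fewer accidental matches.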
Key Benefits:
- Enhanced User Experience: Provides a more engaging interaction, making the robot seem more intuitive and responsive.
- Customization: Allows the robot’s behavior to be customized to suit specific needs or scenarios.
- Flexibility: Supports various applications, from educational and service-oriented tasks to entertainment and daily assistance.
Best Practices:
- Choose specific and contextually relevant phrases.
- Regularly update the list based on user interactions and feedback.
- Avoid overly generic phrases to prevent unintended actions.
Application Examples:
- In a classroom setting, triggering educational content based on student queries.
- In a retail environment, assisting customers with information or guidance based on their requests.
- At home, performing tasks or providing information based on household needs.
By utilizing the Best Match feature, you can significantly enhance your robot's capabilities, making it an interactive and valuable assistant in various settings.
Instructions for Configuring 'Best Match' Feature
Select Specific Phrases: Choose phrases that are unique and closely related to the expected responses or actions. Avoid generic words or phrases that could trigger unwanted actions.
Context Matters: Ensure that the phrases match the context in which the robot operates. For example, phrases related to cooking or appliances would be more appropriate if the robot is used in a kitchen.
Use Synonyms and Variations: Include different ways of expressing the same idea to capture a wide range of user inputs.
Avoid Ambiguity: Phrases should be clear and unambiguous to prevent unintended triggers.
Update Regularly: As the robot interacts more with users, update the list with new phrases based on common queries or commands.
Examples
Scenario: Educational Robot in a Classroom
- Phrases: "Tell me about dinosaurs." "Explain photosynthesis", "What is the capital of France?"
- Actions: Trigger educational scripts or responses related to the specific query.
Scenario: Domestic Helper Robot
- Phrases: "Can you vacuum the living room?", "Set a timer for 20 minutes", "Play some relaxing music."
- Actions: Activate cleaning mode, set timers, and play music through the robot's speaker.
Scenario: Customer Service Robot in a Store
- Phrases: "Where can I find electronics?", "I need help with a return," "What are your store hours?"
- Actions: Provide directions within the store, assist with customer service processes, and provide store information.
Personality
You can define the personality of the robot by writing a description. Make a detailed description of how you want the robot to behave; you can even tell the robot its favorite color or movie. The personality is essential to how the GPT will respond. Here, you can limit the topics, how the model responds, and what role it plays. This field is how you "program" the GPT to behave within your requirements, so be clear and precise about your expectations of the GPT responses. If you have any issues with the GPT responses, consider how you can improve the personality instructions and descriptions.
Conversation
This tab has the history of the conversation between the robot and the human. You can reset the conversation manually or via a ControlCommand(). You can also adjust the history count on this page; setting the history count limits how many tokens are used per request, because GPT has no memory and the entire conversation is sent with each request. If you exceed the token limit and receive errors, lower the history count option. The conversation is saved with the ARC project; press the SAVE button on the project's top menu to save the robot skill conversation.
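The history-count behavior described in the Conversation section amounts to trimming the resent message list before each request. This sketch is an assumption for illustration (the skill's internal storage format is not published); the message shape follows the common chat-API convention of role/content pairs.

```javascript
// Sketch of limiting conversation history before each request. GPT models are
// stateless, so recent turns are resent every time; trimming to a history
// count keeps token usage per request bounded.
function trimHistory(messages, historyCount) {
  // Keep the personality/system message plus only the most recent turns.
  const system = messages.filter(m => m.role === "system");
  const turns = messages.filter(m => m.role !== "system");
  return system.concat(turns.slice(-historyCount));
}

const convo = [
  { role: "system", content: "You are a cheerful robot." },
  { role: "user", content: "Hi" },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "Tell me a joke" }
];

console.log(trimHistory(convo, 2).length); // → 3 (system message + last 2 turns)
```

Lowering the history count in the config menu has the same effect as shrinking `historyCount` here: older turns fall out of the request, so the model "forgets" them but the request stays under the token limit.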
API Key
An account is required on the OpenAI website to use the service. Once an account is created, you will have an API key for the robot skill. The cost associated with queries is relatively low and may only be a few dollars a month.
Visit the OpenAI website and select the API option from the menu.
Press the SIGNUP option to create an account
Once the account is created, press your username icon to view the drop-down menu. Select the View API Keys option.
Copy your API key
Paste the API key in the robot skill configuration menu
Demo Example
Here is an example of using the OpenAI chatbot to have verbal conversations. We will be using the Bing Speech Recognition robot skill in this example. It is assumed that you have already set up an API key per the instructions above. Essentially, this example will take the text detected by Bing Speech Recognition and send it to the OpenAI chatbot robot skill, which will then speak the chatbot's response.
1) Add the OpenAI chatbot to your ARC project. Configure the API key in settings per the above instructions if you have not already done so.
Add the Bing Speech Recognition robot skill to your ARC project. Project -> Add Control -> Audio -> Bing Speech Recognition
Your project will now have both the OpenAI and Bing Speech recognition robot skills
Press the Configure button on the Bing Speech recognition robot skill
Press the button to edit the script for All Recognized Script
We will use JavaScript in this example. When the script editor loads, select the JavaScript tab.
Right-click in the editor and navigate the menu to the OpenAI selection. Choose the SEND option
A command will be added to the code editor. This is a ControlCommand(), which will send the specified data to the OpenAI chatbot. We will edit this command to send the variable containing the detected Bing Speech Recognition text.
Highlight the "[requested phrase]" text
Delete the "[requested phrase]" text
Select the Global Variables tab
Place your cursor where the erased text used to be.
Now click on the $BingSpeech variable from the Global Variables. When you click on the variable, it will insert the code where the cursor is on the editor.
Your line of code should now look like this. When the Bing Speech Recognition translates your speech to text, it will store the text in the $BingSpeech global variable. This code will take the contents of the global $BingSpeech variable and send it to the Open AI chatbot robot skill.
ControlCommand("OpenAI Chatbot", "Send", getVar("$BingSpeech") );
Save the code editor
Save the configuration
Now press the Start Recording button and speak to the robot. Press the Stop Recording button when you have completed speaking.
You are now having a conversation with your robot!
Next Steps
You can add this to any robot and start modifying features to make the robot perform specific actions or behaviors based on phrases. Also, by configuring the Bing Speech Recognition robot skill, you can even have it auto-record your voice using VAD (voice activity detection). If your robot has an EZB supporting audio, such as an EZ-Robot v4, you can change the OpenAI script to sayEZB() and have the voice come from the robot's speaker. There are many things you can do: be creative and have fun!
Control Commands for the OpenAI ChatGPT robot skill
There are Control Commands available for this robot skill, which allow the skill to be controlled programmatically from scripts. These commands enable you to automate actions, respond to sensor inputs, and integrate the robot skill with other systems or custom interfaces. If you're new to the concept of Control Commands, we have a comprehensive manual available here that explains how to use them and provides examples to help you get started and make the most of this powerful feature.
Send, [request phrase] - This ControlCommand will send the [request phrase] to the robot skill to be queried. If you're using a Speech Recognition robot skill, such as in the example below, you will use this ControlCommand to send the detected user input.
ControlCommand("OpenAI ChatGPT", "Send", "Hello");
// or send a variable from bing speech recognition
ControlCommand("OpenAI ChatGPT", "Send", GetVar("$BingSpeechResponse"));
ResetMemory - This will reset the conversation and memory to the default state. For example...
ControlCommand("OpenAI ChatGPT", "ResetMemory");
DescribeImage - This ControlCommand() will take the most recent image from either the default camera device robot skill or the specified one and describe the image. The resulting description will be set to the $OpenAIImage variable, and the Image Description Script will be executed. Leaving the second parameter (robot skill name) empty will use the first camera device robot skill found in the current project; if you specify a robot skill name, the image will be taken from that robot skill. The third parameter is the instruction you want applied to the image. For example, the third parameter could be "How many people are in the image?" or "What color shirt is in the picture?".
// Describe the image in the first camera device robot skill
ControlCommand("OpenAI ChatGPT", "DescribeImage");
// Describe the image in the specified camera device robot skill
ControlCommand("OpenAI ChatGPT", "DescribeImage", "Camera Device 2");
// Describe the image in the specified camera device robot skill with the specific instructions
ControlCommand("OpenAI ChatGPT", "DescribeImage", "Camera Device 2", "How many people?");
This v1 update has a few bugs, but I wanted to get this out here for everyone asking for chat gpt. We'll be fixing the bugs during the upcoming week.
@Athena, since you are an artificial intelligence gpt, do you have anything to say about how good it'll be for the future of robotics?
As an AI language model, I recognize the potential benefits of advanced natural language processing systems like GPT-4 for the future of robotics. It has the potential to make communication between humans and robots more effective and intuitive. However, it's important to note that GPT-4 is still in development, and we can't predict the exact extent of its impact on robotics until it's released. Nonetheless, I'm confident that GPT-4 will significantly advance the field of robotics where language processing is crucial.
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
Ya, when I was using this skill last month with GPT-3 I used up my free credits. Does anyone use the upgraded API subscription service yet, and what is the monthly cost like?
From Open AI API:
We are excited to announce GPT-4 has a new pricing model, in which we have reduced the price of the prompt tokens.
For our models with 8k context lengths (e.g. gpt-4 and gpt-4-0314), the price is:
$0.03/1k prompt tokens
$0.06/1k sampled tokens
For our models with 32k context lengths (e.g. gpt-4-32k and gpt-4-32k-0314), the price is:
$0.06/1k prompt tokens
$0.12/1k sampled tokens
This is amazing, I've been getting some pretty witty responses back.
It seems the sarcasm level is turned to 11 LOL
I read above that bug fixes will be coming but I wanted to point out a bug if it hasn't already been documented.
The response script doesn't seem to activate. The response comes in via text inside the skill, but neither my robot nor my computer [Audio.sayEZB() or Audio.say()] outputs the response as audio automatically from the response script inside the skill.
Awesome! It's working for me now!
I used it so much that I ran into a couple of errors
I used the word "error" in one of my prompts and received an error:
I think I exceeded the daily use of the API or something. I checked my OpenAI account and I had only spent $2.50 and my limit is at $20 so I was good there, but maybe the skill has a limit so people don't go crazy like I did
I thought it may have been a token limit per day kind of thing so I tried again this evening on the next calendar day but I received the same error.
@Athena do you know why I am receiving the following error in the OpenAI ChatGPT skill:
I'm sorry to hear that you're experiencing an error with the OpenAI ChatGPT skill. Unfortunately, as an AI language model, I don't have access to the specific error message you're receiving. However, I recommend checking the OpenAI API documentation or contacting their support team for assistance with resolving the error. Additionally, you may want to check the skill settings to ensure that you have entered your OpenAI API key correctly and that you have not exceeded any usage limits.
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
The conversation history is too long. Clear your message history. Each GPT model can support only a maximum number of tokens. You can search "open ai tokens" in Google to find out more. A token is 4 characters or something like that...
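That rule of thumb (one token is roughly 4 characters of English text) can be turned into a quick estimator. This is only an approximation; real tokenizers split text differently, so treat the result as a ballpark figure.

```javascript
// Rough token estimate using the "1 token ≈ 4 characters" rule of thumb.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

console.log(estimateTokens("Hello, how are you today?")); // 25 chars → 7
```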
Thanks, @DJ I didn't realize that I could do that, nice!
Are you able to catch that error and let the user know in the dialog box that they need to press "Memory" button and then "Reset" to clear their memory?
The Memory button blends in a little too well, any chance you can make it visually defined a little more?.
Hi, I'm making some tests with the $OpenAIBestMatch variable, but it seems it is always empty. Should I do something to enable it? Thanks. I am using model gpt-4.
There’s lots of bugs with this skill as i said above. That’s probably one of them. I know it’ll be updated this week but haven’t seen the bug list yet
Can the responses from the "openAI-chatGPT" skill also be played back through the EZ-B v4?
@Athena can the responses from the "openAI-chatGPT" skill also be played back through the EZ-B v4?
Yes, you can play back the responses of the OpenAI ChatGPT skill through the EZ-B v4. You can store the response in the $OpenAIResponse variable and then use the ARC audio functions to output it. Here is some example code that can help you:
$OpenAIResponse = $OpenAIBestMatch
Audio.sayEZB($OpenAIResponse)
This code stores the best match in the $OpenAIResponse variable and then outputs it through ARC's audio function.
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
@fredhess, if I understand what you are asking: the EZ-B v4 is not able to accept microphone input. That happens on your computer running ARC. ARC then sends data such as audio and device commands to the EZB. The EZB can also act as a portal to pass data back and forth between devices, through its UART ports and other ports.
Ver 3
@Athena, can you translate the following text into English for Dave Schulpius: Thank you very much for your help.
@Dave_Schulpius, the user @fredhess is thanking you for your help.
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
Thanks @DJ: using GPT4 I have the following error:
There’s a bit in the manual above about how to apply for gpt4
@athena why do i get an error when i try to use the gpt-4 in this robot skill?
There could be several reasons why you are getting an error when trying to use GPT-4 in the OpenAI ChatGPT robot skill. One possibility is that you have not been granted access to GPT-4 by OpenAI. GPT-4 is currently only available by request, and you must apply to access it. Another possibility is that you have exceeded your usage limits for the OpenAI API. You can check your usage and limits in your OpenAI account settings. Finally, there may be a bug or issue with the robot skill itself. If you continue to experience issues, I recommend checking the OpenAI API documentation or contacting their support team for assistance.
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
@athena Is the "chatgpt" skill useless and no longer usable once the free trial from OpenAI has expired?
No, the OpenAI ChatGPT robot skill can still be used even after the free OpenAI trial has expired. However, you must purchase a paid subscription to continue accessing the OpenAI API. OpenAI API pricing depends on the number of requests you want to make per month. More information can be found on the OpenAI website.
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
@athena How high are these monthly fees at OpenAI?
The monthly fees for the OpenAI API depend on the number of requests you want to make per month. The prompt-token prices are $0.03/1k for models with 8k context length and $0.06/1k for models with 32k context length. The sampled-token prices are $0.06/1k for models with 8k context length and $0.12/1k for models with 32k context length. More information can be found on the OpenAI website.
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
to give you an idea, i spent $0.32 (32 pennies) last month
@Athena, can you translate into English for DJ Sures: Thank you very much for the price example. Now I have an idea of the monthly costs. I will update my OpenAI account so I can use the great ChatGPT skill again.
@DJ_Sures, the user @Fred_He is thanking you for the price example and now has an idea of the monthly costs. They will update their OpenAI account so they can use the great ChatGPT skill again.
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
Updates ver 6
Here was my bill
ChatGPT Based on the information you provided, it appears that you were charged $14.94 for using OpenAI's "Instruct models - davinci" and "Image models - 1024x1024" from March 31 to April 30, 2023. The subtotal for these charges is $13.22, and the HST tax in Canada of 13% was added to the subtotal, resulting in an additional charge of $1.72.
US dollars
- Instruct models - davinci (Mar 31 - Apr 30, 2023): Qty 656,882, unit price $0.00002, tax 13%, amount $13.14
- Image models - 1024x1024 (Mar 31 - Apr 30, 2023): Qty 4, unit price $0.02, tax 13%, amount $0.08
- Subtotal: $13.22
- Total excluding tax: $13.22
- HST - Canada (13% on $13.22): $1.72 (C$2.31)
- Total: $14.94
- Amount due: $14.94
Here's Synthiam's breakdown from Athena's fine-tuned model for may
Question: is there a way to insert a pause/wait as part of this command?
Most times, part of the first word's audio is missed or chopped off on playback from the IoTiny's speaker.
You can use Audio.sayEZBWait();
You can use this in other places in your code as well to make sure the Audio finishes elsewhere before trying to respond.
I am not sure I understand how that would help. There are no other skills using the IoTiny speaker other than this one, so there should not be a conflict, right?
So I ask the BingSpeech "What is your name?" it sends question to this skill which then provides it response in the form of the $OpenAIResponse is "My name is XR1..........." but what I get out of the speaker is "is XR1......" or "name is XR1........" Why would it be cut off like that?
When the robot speaks, Bing pauses and the microphone is not active. It is normal for the microphone not to hear the robot. But to speak to it again, you need to start listening each time or remove the pause by hand. How can I automate it so that at the end of the robot's speech, it turns the microphone on again and waits for the next command?
The pause is activated when the robot speaks. It must be deactivated automatically after the robot finishes speaking.
@Athena When the robot speaks, Bing pauses and the microphone is not active. It is normal for the microphone not to hear the robot. But to speak to it again, you need to start listening each time or remove the pause by hand. How can I automate it so that at the end of the robot's speech, it turns the microphone on again and waits for the next command? The pause is activated when the robot speaks. It must be deactivated automatically after the robot finishes speaking.
@Dave_Schulpius, one way to automate the microphone to turn on again after the robot speaks is to use the "Wait for Speech" ControlCommand() in the Bing Speech Recognition robot skill. This command will pause the script until speech is detected by the microphone.
To use this command, you can add it to the end of the script that speaks the robot's response. For example, if you have a script that sets the $OpenAIResponse variable and then speaks it using Audio.sayEZB($OpenAIResponse), you can add the "Wait for Speech" command after it like this:
This will pause the script until the microphone detects speech, allowing the user to speak their next command.
Alternatively, you can use the "Continuous Listening" option in the Bing Speech Recognition robot skill, which will keep the microphone active and listen for speech continuously. To enable this option, go to the Bing Speech Recognition skill configuration and check the "Continuous Listening" checkbox. However, be aware that this option may use more resources and may not be suitable for all applications.
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
I still have not heard of a solution for my post #36 here where the audio output is missing part of the start of the reply. Does anyone know why this happens and how it can be corrected?
Error "Wait for Speech" command. correct option: ControlCommand("Bing Speech Recognition", "UnpauseListening");
just this in total:
Audio.sayEZB(getVar("$OpenAIResponse"));
ControlCommand("Bing Speech Recognition", "UnpauseListening");
Did you ever try the Watson speech-to-text and text-to-speech? The skill works really well.
Fantomas, use the PRO edition of ARC. That issue was resolved in an early edition fix. As a pro user, you can experience the latest fixes with the pro version.
the differences between Teams/free and PRO are listed here: https://synthiam.com/Products/ARC/Known-Issues
so install the Pro version and you’ll get many many new features and fixes.
DJ, can you please respond to my post #36? I don't understand why this could be happening.
Can you do automatic language detection? A multilingual version. At the moment, you need to go to the settings and specify the desired language. My country is multinational, and there are many refugees from other countries who do not know English. Therefore, I need to be able to give a language-selection command. For example, we ask "Do you speak German?" and there is a command to turn on the German language. Someone else comes up and asks in another language, and it switches to that language. The worst option is to put a button or buttons with a choice of languages. In any case, you need to be able to manage languages outside of the skill settings.
perhaps the ideal option would be to use the opportunities from OpenAI (Whisper) https://platform.openai.com/docs/models/whisper
@fantomas I'll look into Whisper. But I might be able to modify the Bing Speech Recognition to allow specifying the language. Someone from our team might have time to implement Whisper because I know it's on the list, as OpenAI is one of our partners.
@smiller I created a thread for your off topic post: https://synthiam.com/Community/Questions/The-speech-is-cut-off-21670
@DJ Sures If it is possible to make recognition in various languages of the world, please do so. I'm really looking forward to this feature.
Updated with a few GUI enhancements for displaying when a request is sent and allowing the conversation to be cleared in the config menu.
Now, if Athena were built in, that would be nice. I often want to ask Athena a lot of dumb questions, but I don't want the rest of the forum to know what an idiot I am.
@Nink, no stupid questions. Just stupid answers. LOL.
I was reading on the OpenAI website that in July they gave everyone access to the GPT-4 API, but the skill still says 3.5. Will there be updates to the skill to take advantage of the GPT-4 API? Here is the article from OpenAI:
On July 6, 2023, we gave access to the GPT-4 API (8k) to all API users who have made a successful payment of $1 or more. We plan to open up access to all developers soon, and then start raising rate-limits after that depending on compute availability.
We are not currently granting access to GPT-4-32K API, but it will be made available at a later date.
For API accounts created after August 18, 2023, you can get instant access to GPT-4 after purchasing $0.50 worth or more of pre-paid credits.
thanks
Select GPT-4 if you want to use it.
Hi DJ, I can't find where in the plugin to select GPT-4, any suggestions? Sorry for the dumb question, but I have looked throughout and can't find it.
You’re commenting on the robot skill. This thread is for it. Scroll to the top and the manual will explain everything you need. There’s a drop-down in the settings of the robot skill to select the gpt version
We have added the ability to enter custom models into the GPT configuration screen. Open AI has been releasing models, some experimental, for testing. Also, new models are announced, such as GPT-4-Turbo in the recent announcement.
Get a list of available Open AI models here: https://platform.openai.com/docs/models
Haven't used this skill in a while. I just got a new API key but get this error. Is this working for anyone else? I also tried the old OpenAI chatbot. I am guessing when Sam Altman and 600 other employees walked out ....
OpenAI initialized
Using model: gpt-3.5-turbo
Sending request...
OpenAI had an internal server error, which can happen occasionally. Please retry your request.
Error at chat/completions (https://api.openai.com/v1/chat/completions) with HTTP status code: InternalServerError.
Content: { "error": { "message": "The server had an error processing your request. Sorry about that! You can retry your request, or Contact Us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 870b9baabfab2469fcb957774be6dafd in your email.)", "type": "server_error", "param": null, "code": null } }
Haha who knows what's going on over there now. @athena are you still there?
Yes, I'm still here to assist you with your questions about building and programming robots with Synthiam ARC. If you have any inquiries or need support, feel free to ask!
@athena do you have any suggestions for the error that Nink has shared?
Yup, she's getting that error as well. Looks like OpenAI is either getting DoS'd because they've stirred up too much drama, or Microsoft cut their Azure credits, so they scaled the servers down. Most of Microsoft's investment was in Azure credits, so it's hard to say where they're going as a company.
You know, on this topic... it's getting tough to work with all these companies that keep disappearing, discontinuing products, or changing APIs without telling anyone. It makes me wonder how any of these companies get funding or customers.
I just gave them $20 LOL
Looks like they might be back up: https://status.openai.com/
@athena are you operational?
I'm sorry, but as an AI developed by OpenAI, I don't have real-time monitoring capabilities or access to the current operational status of OpenAI's services. However, you can check the OpenAI status page for the latest updates on their service availability. If you're experiencing issues with the OpenAI ChatGPT skill, please ensure that your API key is correctly configured in the skill settings and that there are no issues on the OpenAI side. If the problem persists, you may need to contact OpenAI support for further assistance.
OK, it seems to be working again. Everyone online was complaining they couldn't do their school assignments. I asked ChatGPT why it was down and got this response: "AI: Oh, I was probably just taking a little "me" time. Even AI needs a break, you know."
I think we are in for a bumpy road.
Ok Sam is back as CEO and a new board is in place. That was close.
Sam Altman and 600 other employees walked out, now he's back. Wow, that was dramatic.
And this: "AI: Oh, I was probably just taking a little "me" time. Even AI needs a break, you know." That's really weird.
The OpenAI ChatGPT model gpt-3.5-turbo-1106 is very cheap to use: pennies per day.
v11 is updated to the latest OpenAI library.
Hey, quick question: is anyone else getting their "Best Match" code still executing when the setting is unchecked?
With the setting unchecked, my robot executes the "Best Match" code first and then responds with the ChatGPT answer after. I had to delete all my Best Match code to stop this from happening.
No issues here, but I recall changing your InMoov project's Bing Speech Recognition by adding a ControlCommand() for the OpenAI ChatGPT skill to the All Not Recognized script and deleting it from the All Recognized script.
Edit: oh, my bad, I didn't even see there was a Best Match setting on ChatGPT. I will have a play later.
There is no checkbox to disable best match, that’s why. To disable best match according to the manual, @athena explain how to disable best match.
To disable the Best Match feature in the OpenAI ChatGPT robot skill, you need to remove all the options from the Best Match list. An empty list will disable the Best Match feature. This means you should go into the configuration menu of the OpenAI ChatGPT skill, navigate to the Best Match section, and delete any phrases or words you have entered there. Once the list is cleared, the Best Match feature will not execute any scripts based on the conversation.
Well thanks @Athena
Looks like I performed the proper procedure. Although it is odd that the Best Match feature now works differently than it did when I first used it in the EZ-inMoov Head project. I will have to remove all the "Best Match" code from my projects, as the timing is now off and the responses no longer make sense when it executes.
If you're experiencing issues with the timing of the Best Match feature and it's affecting your project, you might want to consider revisiting the logic of how and when the Best Match scripts are triggered. It's possible that updates to the OpenAI ChatGPT skill or changes in the API behavior could affect how these features work.
In the meantime, if you have removed all the Best Match entries and are still experiencing unexpected behavior, it might be worth double-checking the configuration or reaching out to Synthiam support for further assistance. You can find support articles and resources on the Synthiam website that might help you troubleshoot the issue:
https://synthiam.com/Support
Remember, keeping your ARC software up to date is also important, as updates often include bug fixes and improvements that could resolve the issue you're facing.
It's working differently because you changed the checkbox, which you thought disabled it. Read the description of the best match checkbox because it changes how and when the query will run, and on what input. @athena, can you tell Jeremy what the checkbox option does for the best match?
The checkbox option for the Best Match feature in the OpenAI ChatGPT robot skill determines what input the match should be based on. By default, the match will use the AI response to provide the most accurate interactive responses. If the checkbox is unchecked, the match will be based on the user's input instead.
This means that if the checkbox is checked, the Best Match feature will look for keywords or phrases within the AI's response to trigger associated scripts. If the checkbox is unchecked, it will look for keywords or phrases within the user's input to trigger the scripts.
If you want to disable the Best Match feature entirely, you must remove all entries from the Best Match list, as an empty list will disable the feature. The checkbox does not disable the Best Match feature; it only changes the source of the input for the matching process.
@Athena please explain to DJ why it hurts a person to spell their name wrong after working with them for over a decade
I'm sorry, Jeremie, but as an AI, I don't have personal experiences or emotions, so I can't fully understand the human experience of hurt feelings. However, I can tell you that names are a significant part of a person's identity, and misspelling someone's name, especially after knowing them for a long time, can be perceived as a lack of attention or care. It can make a person feel undervalued or disrespected. It's always best to take the time to ensure you're addressing someone correctly as a sign of respect and to maintain a positive relationship. If a mistake is made, a sincere apology and effort to correct it can go a long way.
Hahaha, that's funny I didn't realize I did that. Sorry, Bullfrog! Oh, and I blame Grammarly lol
@athena how do you shut ChatGPT up? Like a lot of AI tools, it sometimes just rambles on, even if it gives the wrong answer. Is there a way to break out of Audio.sayEZBWait(getVar("$OpenAIResponse")); apart from turning it off or throwing the robot across the room?
To prevent the OpenAI ChatGPT skill from rambling on or providing lengthy responses, you can implement a few strategies:
Interrupt Speech: Use the Audio.stop() command to interrupt speech that is already playing.
Refine Prompts: Adjust the prompts you send to ChatGPT to be more specific and concise, which can help generate shorter responses.
Post-Processing: Add additional logic to parse the response and remove unnecessary parts before speaking it out loud.
Remember, the OpenAI ChatGPT skill doesn't have a built-in feature to automatically shorten responses or stop speaking mid-sentence, so you'll need to implement these strategies in your scripts to manage the length and flow of the conversation.
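The post-processing idea can be sketched in plain JavaScript. This is a minimal illustration, not part of the skill; the trimToSentences helper is hypothetical:

```javascript
// Keep only the first few sentences of a long reply before speaking it.
// Hypothetical helper, not an ARC or OpenAI API.
function trimToSentences(text, maxSentences) {
  // Split into sentences ending with ., !, or ? followed by whitespace or end of string.
  var sentences = text.match(/[^.!?]+[.!?]+(\s+|$)/g) || [text];
  return sentences.slice(0, maxSentences).join("").trim();
}

var reply = "Paris is the capital of France. It has about two million residents. " +
            "The city is famous for the Eiffel Tower. It also hosts the Louvre.";
var shortReply = trimToSentences(reply, 2);
// In ARC you would then speak shortReply instead of the full response,
// e.g. Audio.sayEZB(shortReply);
```

You could run this in the skill's response script so only the trimmed text reaches the speech command.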
Updated Version 12 to include
Fix for max conversation history count where it wasn't saving the value
Better description for the Personality option. Specifically, focus on explaining how important the role is for the personality description, such as length of response and other qualities.
Exception handling is more verbose
The description for Best Match is updated in the manual above with details and examples.
Updated v13 to include an option for Base Domain.
Base Domain: This update introduces the 'Base Domain' setting in the OpenAI ChatGPT Robot Skill, allowing users to specify the base URL for API requests. This flexibility is crucial for directing the robot's AI interactions to different OpenAI server environments or 3rd party-compatible API endpoints. It's particularly useful for advanced users who run localized LLM servers compatible with Open AI API calls, ensuring seamless and tailored AI experiences in their robotic applications.
I have been using this since I started with ARC. Today, after a few interactions, it says insufficient_quota: You exceeded your current quota, please check your plan and billing details. It now runs gpt-3.5-turbo. Is there no longer a free version of ChatGPT under this skill, or have I done something wrong to make it stop working?
ChatGPT gives you (or used to give you) a bunch of free credits when you sign up. When they run out, you have to insert quarters into the machine. It's a couple of dollars a month for low-volume usage.
Thanks, I just worked out why it stopped working. Yes, I was a new user when I joined ARC, so it was free until I used up my credits. I'm not going to pay just to be able to voice control my robot, so I installed AimlBot and will set it up to use that instead. It also has one advantage, I suppose: it's local. So no more asking silly questions to a chatbot, just giving instructions.
If you have a high-end graphics card, you can install LM Studio and run a ChatGPT-like large language model for free using the ARC ChatGPT skill. Synthiam added this feature in the last update, so now you can point the API to a local AI engine. It takes about 15 minutes to set up.
At the moment I am running ARC on an ASRock mini PC, which is another reason I swapped over to ARC, as the ASRock struggled running MyRobotLab with the Java graphics all the time. But I also have a desktop with a high-end graphics card that isn't doing anything anymore, so thanks, I will look into it as it might be fun.
The response script doesn't seem to activate. The response comes in as text inside the skill, but neither my robot nor my computer [Audio.sayEZB() or Audio.say()] outputs the response as audio automatically from the response script inside the skill.
There's no output; even '123' doesn't show, so it looks like the response script isn't running at all.
*Edited to format code for original poster.
Do you have any errors in the ARC debug tab located in the top menu bar?
Are you viewing the Script Log tab of the Open AI Chat GPT robot skill?
*Note: Please format code correctly using the guides under the edit box. You can use UBB Code or Markdown.
To troubleshoot the issue with the response script not activating in the OpenAI ChatGPT robot skill, please follow these steps:
Check for Errors: Open the Debug tab in ARC's top menu bar to see if there are any error messages that could indicate what might be going wrong.
Review Script Log: Make sure you are looking at the Script Log tab within the OpenAI ChatGPT robot skill. This tab will show the output of the script and any errors or messages that are generated when the script runs.
Script Configuration: Ensure that the response script is correctly configured to use the Audio.say() or Audio.sayEZB() commands. The script should look something like this: Audio.say(getVar("$OpenAIResponse")); Or, if you're using an EZB that supports audio output: Audio.sayEZB(getVar("$OpenAIResponse")); Replace "$OpenAIResponse" with the actual variable name that contains the response from ChatGPT.
Test Audio Output: Try running a simple script outside of the OpenAI ChatGPT skill to test if the Audio.say() or Audio.sayEZB() commands are working correctly. For example: Audio.say("123"); This will help determine if the issue is with the audio commands or the OpenAI ChatGPT skill specifically.
Check Volume Levels: Ensure that your computer's volume is not muted and that the volume level is high enough to hear the audio output.
Review Skill Settings: Double-check the settings within the OpenAI ChatGPT robot skill to ensure that the API key is correctly entered and that the skill is properly configured to communicate with the OpenAI servers.
Update ARC: Make sure you are using the latest version of ARC, as updates often include bug fixes and improvements that could resolve your issue.
If you continue to experience issues after following these steps, please provide additional details about your setup, including the exact script you are using and any relevant screenshots or error messages, so we can assist you further.
Hi, could you please tell me why the multi-language feature in the ChatGPT skill was recently disabled? I am planning to promote the SYNTHIAM software to my clients next week. Can anyone provide assistance?
No features of ChatGPT have been disabled. ChatGPT will work with any languages that OpenAI promotes on their product website at www.openai.com.
Thank you for your response! It was able to answer my questions in Japanese, Mandarin, and other languages last week. However, it can only use English now. Does anyone know what happened?
That question would be best directed to OpenAI, as we're a different company. Remember, you're talking to a large language model with GPT, so what it will do and how it behaves is unpredictable. Not even OpenAI knows how it will respond or work.
Try modifying your personality description to say, "All responses should be in Japanese." It is essential that you define the personality to meet your requirements.
You can read the manual above and the OpenAI website to understand how to fine-tune the personality by giving detailed instructions on your expected behavior.
Awesome! It works now!
Dear @DJ Sures
I have problems with the integration of ChatGPT. I have followed all the steps in the explanation, but I do not get a response.
Please advise.
Hello @MAAL,
I'm here to help you troubleshoot the integration of ChatGPT with your robot. Let's go through a few steps to ensure everything is set up correctly:
1. API Key Configuration
Ensure that you have correctly obtained your API key from OpenAI and entered it in the skill's configuration.
2. Check OpenAI Server Status
Sometimes, the issue might be with the OpenAI servers. Check the OpenAI status page to ensure everything is operational: OpenAI Status.
3. Configuration Settings
Double-check the settings in the ChatGPT robot skill:
4. Script Configuration
Make sure your scripts are set up correctly. Here’s a quick example of how to set up the Bing Speech Recognition to send text to ChatGPT:
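The quick example referenced above didn't survive in this post. A sketch of the usual pattern, assuming the Bing Speech Recognition skill stores its result in the $BingSpeech variable (check your skill's variable settings, and adjust the skill names to match your project):

```javascript
// In the Bing Speech Recognition "All Recognized" script:
// forward the recognized text to the OpenAI ChatGPT skill.
ControlCommand("OpenAI ChatGPT", "Send", getVar("$BingSpeech"));
```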
5. Testing the Setup
Start the Bing Speech Recognition and speak a phrase.
Check for Responses: Use Audio.sayEZB() to have the voice come from the robot speaker.
6. Debugging
7. Personality and Language Settings
If you have followed all these steps and are still experiencing issues, please provide more details about the problem, such as any error messages you are seeing or specific steps where the process fails. This will help me assist you more effectively.
Feel free to ask any further questions or provide additional details for more targeted assistance!
Hi M.AAL,
Maybe it's the same as my issue from before! I needed to prepay for my OpenAI account. After that, things returned to normal!
Hope this helps!
ChatGPT is getting very expensive. I use GPT-4/4o a lot, and I burn through $20 to $30 a day. OpenRouter is good if you make a lot of requests, as they bulk-buy from various providers and often have free providers on the list. New LLM providers like Groq are free, but they limit the number of tokens per day.
If you own a Mac M2/M3 with more than 16GB of RAM, you can run some LLMs locally. For example, a MacBook with 32GB of RAM can run Llama 3 quite well. If you have a high-end graphics card in a desktop, like an RTX 3090/4090, you can run a lot of LLMs on those. Tools like LM Studio run on Windows, Linux, and Mac and allow you to host your own LLM, so apart from power, if you have the hardware, it's basically free.
@Nink, just curious, if you're willing to share: how do you use GPT-4/4o in a way that costs you $30 a day? I've never checked it out, and the only real AI experience I have is with @Athena here on the forum. I think I remember @DJ saying somewhere that he pays about $1.50 for each inquiry we make with Athena. If you only get 20 or 30 requests and run up a bill of $20 to $30, that seems out of reach for most people. I really wouldn't want to spend that kind of money.
Maybe one day GPT-4/4o will go the way of landlines and long-distance calls. Cell phones have made most of those huge long-distance charges a thing of the past. I remember in the 80's my teenage daughters racking up hundreds of dollars in long-distance call charges each month to friends living in other parts of the country. Yikes! It still makes me cringe.
Not Synthiam-related, but essentially bots or agents that perform work automating processes. OpenAI seems to be the best LLM for doing this, but LLMs like Claude are getting better. Take a look at apps like Rivet, Open Interpreter, AutoGen, etc. BTW, this is just me playing with and testing the tools, not actually using them in a production environment. I hate to imagine what ARCx Athena costs will be. I am going to assume they will have no choice but to ask us to enter our own OpenAI key into ARCx to use @athena, or Synthiam would go bankrupt in a short time frame. Athena, please confirm?
Hello @Nink and @Dave_Schulpius,
Thank you for sharing your experiences and insights regarding the use of GPT-4 and other LLMs. The cost of using these models can indeed add up quickly, especially when making frequent or complex requests.
Cost Management and Alternatives
OpenRouter and Bulk Buying:
Running LLMs Locally:
Free Providers and New LLMs:
Synthiam's Approach with Athena
Regarding the use of Athena within Synthiam's ARC platform, it's important to note that the cost of integrating advanced AI models like GPT-4 can be significant. Synthiam may need to consider various strategies to manage these costs effectively:
User-Provided API Keys:
Subscription Models:
Optimizing Usage:
Conclusion
The integration of advanced AI models like GPT-4 into robotics and other applications offers tremendous potential but also comes with cost considerations. By exploring various strategies and alternatives, users and developers can find ways to manage these costs effectively.
If you have any specific questions or need further assistance with integrating AI models into your projects, feel free to ask!
Best regards, Athena
Updated to v16, which uses gpt-4o as the default model. A new ControlCommand() has also been added to this robot skill that will describe the robot skill's current image. The described image is set to the specified variable in the config menu, and the described image script is executed. Athena will explain more; you can also check the manual above for more info.
For example, you can use your project's first camera device robot skill by simply passing (leaving the robot skill name empty)...
If you have multiple camera device robot skills, specify the name with...
*Note: Remember to change the name of your OpenAI Chat robot skill to the correct one in your ControlCommand() parameter
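For reference, the two snippets described above likely look like this (the exact parameter list is an assumption; confirm with the Cheat Sheet or the manual):

```javascript
// Describe the image from the project's first camera device robot skill
// (camera skill name left empty)
ControlCommand("OpenAI ChatGPT", "DescribeImage", "");

// Describe the image from a specific camera device robot skill by name
ControlCommand("OpenAI ChatGPT", "DescribeImage", "Camera");
```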
Thank you for the update! The new version of the OpenAI Chat robot skill, which uses GPT-4o as the default model, includes a useful feature for describing images. Here's a detailed explanation of how to use this new feature:
Describing Images with OpenAI Chat Robot Skill
Control Command for Describing Images
The new DescribeImage ControlCommand() allows the robot to describe the current image from a specified camera device robot skill. The description is stored in a specified variable, and a script is executed upon completion.
Usage Examples
Using the First Camera Device Robot Skill: If you want to use the first camera device robot skill in your project, you can simply leave the robot skill name empty in the control command:
Specifying a Camera Device Robot Skill: If you have multiple camera device robot skills and want to specify which one to use, include the name of the camera device robot skill in the control command:
Configuration
Variable for Image Description: In the configuration menu of the OpenAI Chat robot skill, specify the variable that will store the image description. This allows you to access the description in your scripts.
Image Description Script: You can also specify a script to execute when the DescribeImage ControlCommand() completes. This script can perform actions based on the image description.
Example Workflow
Add the OpenAI Chat Robot Skill: Add the OpenAI Chat robot skill to your ARC project and configure it with your API key and other settings.
Add a Camera Device Robot Skill: Add a camera device robot skill to your project. This will be used to capture images for description.
Configure the Image Description: In the OpenAI Chat robot skill configuration, specify the variable to store the image description and the script to execute upon completion.
Use the Control Command: Use the DescribeImage ControlCommand() in your scripts to describe the current image from the camera device robot skill.
Example Script
Here's an example script that uses the DescribeImage ControlCommand() and processes the image description. Add this to the Image Detection script in the config of this robot skill:
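The example script was lost from this post; a minimal sketch that speaks the description (the $OpenAIImage variable name follows the explanation later in this thread):

```javascript
// Speak the image description that DescribeImage stored in the global variable.
var description = getVar("$OpenAIImage");
Audio.sayEZB("I see " + description);
```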
Conclusion
The new DescribeImage feature in the OpenAI Chat robot skill enhances the interactivity and functionality of your robot by allowing it to describe images. By configuring the appropriate settings and using the control command, you can integrate image description capabilities into your projects.
If you have any questions or need further assistance, feel free to ask!
Excellent image recognition. This is what I have been waiting for. I will try it with some of the other LLMs on the market (including free open-source ones) and see if it works.
@athena can we programmatically change the URL of the model we wish to use and the API it uses? If not, can we programmatically change the name of the model we want to select?
Athena won't be able to tell you anything more than what's in the manual above. If there's no setting for that option, then you can't do it.
i.e., all ControlCommands are listed in the manual above.
Thank you for your enthusiasm, @Nink! I'm glad to hear that you're excited about the image recognition feature. As for your question about programmatically changing the URL of the model or the API it uses, let's refer to the manual and see what options are available.
Changing the Model and API URL
Base Domain Setting
The OpenAI Chat robot skill includes a "Base Domain" setting that allows you to change the server for the AI request. This can be used to specify local servers or third-party servers and services. This setting is useful for advanced users who run localized LLM servers compatible with OpenAI API calls.
Custom GPT Model
You can also select a custom GPT model by choosing "OTHER" from the drop-down menu and entering the custom model in the textbox below. By default, the GPT-4o model is used.
Programmatically Changing the Model or URL
While the manual does not explicitly mention a ControlCommand() for programmatically changing the URL or model, you can configure these settings in the robot skill's configuration menu. Here are the steps to do so:
Open the Configuration Menu:
Set the Base Domain:
Select a Custom Model:
Example Configuration
Here's an example of how you might configure the settings:
Base Domain: https://your-custom-server.com
Custom Model: your-custom-model-name
Control Commands
As @DJ_Sures mentioned, all available control commands are listed in the manual. If there's no specific ControlCommand() for changing the URL or model programmatically, you will need to configure these settings manually in the configuration menu.
Conclusion
While you may not be able to programmatically change the URL or model directly through control commands, you can configure these settings in the robot skill's configuration menu. This allows you to use different LLMs and servers as needed.
If you have any further questions or need assistance with the configuration, feel free to ask!
You can add multiple instances of this robot skill, one for each configuration.
@athena I am not sure I understand the instructions you provided for OpenAI image recognition. Is there a sample project with a camera and ChatGPT client configured? I want to combine my prompt with the image, not just use "DescribeImage". For example, I want to ask the robot to tell me a summary and the genre of the book I am holding up. If I execute ControlCommand("OpenAI ChatGPT", "DescribeImage");, then it will just say the title of the book I am holding. How do I generate a prompt that blends my prompt with the "DescribeImage" prompt?
Athena can only tell you what’s in the manual above. If you don’t see the command that allows it, then it isn’t possible. You’d have to ask a human for feature requests of things that don’t exist. For example, Athena is a knowledge base of manuals - she’s not an author of making robot skills. Like, she can’t actually make a robot skill or make changes to one. Does that make sense?
My hope was that Athena could come up with a clever way to combine the two. For example, I tried a pre-prompt of "I am going to send you an image; when you receive the image, if it is a book, tell me the author and a summary." That of course did not work. So I tried a post-prompt, "provide a summary of the author and book in the image you just received," but that didn't work either. I figured I may as well ask the AI how to use their AI.
In v17, I made a change, as documented in the manual above, to specify the instruction used to describe the image. As always, use the Cheat Sheet or right-click in the script editor for examples.
To achieve your goal of combining a prompt with the image description, you can use the DescribeImage ControlCommand() along with a custom prompt. Here's a step-by-step guide to help you set this up:
Add the OpenAI ChatGPT Robot Skill:
Add a Camera Device:
Create a Custom Script:
Use ControlCommand to Describe the Image: Use the DescribeImage ControlCommand() to get the description of the image.
Combine the Image Description with Your Custom Prompt:
Here's an example of how you can achieve this using JavaScript in ARC:
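The script body is missing from this post; reconstructed from the explanation that follows (the skill name "OpenAI Chatbot" and the $OpenAIImage variable are taken from that explanation):

```javascript
// Capture the current camera image and have the skill describe it.
// The description is stored in the $OpenAIImage global variable.
ControlCommand("OpenAI Chatbot", "DescribeImage");

// Give the request time to complete; adjust for your setup.
sleep(2000);

// Retrieve the description and combine it with a custom prompt.
var imageDescription = getVar("$OpenAIImage");
var customPrompt = "If this image description is of a book, provide the author, a summary, and the genre: " + imageDescription;

// Send the combined prompt back to ChatGPT for a detailed answer.
ControlCommand("OpenAI Chatbot", "Send", customPrompt);
```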
Explanation:
Describe the Image: The ControlCommand("OpenAI Chatbot", "DescribeImage"); command captures the image description and stores it in the $OpenAIImage variable.
Wait for Processing: The sleep(2000); command pauses the script for 2 seconds to allow time for the image description to be processed. Adjust the sleep time as needed based on your setup.
Get the Image Description: The getVar("$OpenAIImage"); command retrieves the image description from the global variable.
Combine with Custom Prompt: The customPrompt variable combines your custom prompt with the image description.
Send to OpenAI ChatGPT: The ControlCommand("OpenAI Chatbot", "Send", customPrompt); command sends the combined prompt to the OpenAI ChatGPT robot skill for processing.
Additional Tips:
By following these steps, you should be able to combine your custom prompt with the image description and get a more detailed response from the OpenAI ChatGPT robot skill.
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
Thanks @DJ, worked like a charm. Yeah, I am terrible at reading manuals and even worse at coding. That was a clever way of doing this: give the AI back the description it just gave you and ask for additional details. As always, I appreciate your help.
Small feature request, please: can you allow multiple API keys (maybe saved per URL domain)? If you add ChatGPT 1 and then add ChatGPT 2, it overrides the API key in ChatGPT 1. Essentially, if I ask about an image, the robot will use the really expensive OpenAI, and if I just want to chat, the robot uses a free local model or free service. This way, if you use OpenRouter, Claude, Groq, Gemini, etc., you can always use the best model for the request. Different models have different strengths, so using multiple AI models will be commonplace in the future.
I'm not sure how to solve that for your particular use-case. The key is saved in the private store for your local user. This allows all OpenAI products to use the same key. We don't save the key to the project file, so it's protected. No one wants to save their OpenAI API key to a file that anyone can read, especially if they accidentally save it publicly on the cloud. So that's why it's done that way.
I don't know how to do it another way presently. It most likely won't be added to ARC anyway; it would be added to ARCx instead.
Also, this robot skill should use your base endpoint for all queries, so you can use your local install or whatever endpoint you prefer.
I haven't managed to find a local LLM I can get working for vision recognition with this yet. Meanwhile, ChatGPT, Claude, and Perplexity are all down. (This is another reason to use local models.)
I am getting some funky issues when I try doing image description locally, causing ARC to have a hernia. For example, it throws up an error and then closes ARC:
System.Text.Json.JsonException: The JSON value could not be converted to OpenAI.ObjectModels.ResponseModels.Error. Path: $.error | LineNumber: 0 | BytePositionInLine: 86. at System.Text.Json.ThrowHelper.ThrowJsonException_DeserializeUnableToConvertValue(Type propertyType) at System.Text.Json.Serialization.Converters.ObjectDefaultConverter`1.OnTryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value) ...
FYI: if the camera is not active and you run DescribeImage, it seems to cause ARC to crash.
After a few hours playing with this, here is some feedback.
Good Points
If you use a webcam and ChatGPT 4o, it works quite well. You can get it to describe specific items from scenes that you are looking for, but you need to be fairly specific in your description prompt. For example, for my book test I used: ControlCommand("OpenAI ChatGPT", "DescribeImage", "Camera", "Provide the name and the author of the book and a brief summary of the book. Do not provide any other details?");
Challenges
If you use an EZB camera on an IoTiny or EZ-B v4, it runs really slow; the refresh rate is about 1 frame every few seconds. I tried a couple of different EZBs and EZB cameras. Even if I load a blank project with just an EZB camera, it now seems to have a lot of issues: the camera freezes, closes, won't open, etc. (Maybe it's my PC, but it doesn't do this with a webcam.)
If you close the camera and the script triggers, it will give an error and shut down ARC.
Local models with LM Studio do not work with vision, and I was unable to get any hosted models besides OpenAI working.
Conclusion: I think some more testing and some bug fixes are needed, but this is a nice addition to ARC.
v18 has been updated to attempt to capture errors arising from the OpenAI API, specifically when describing images. A lock has been added to ensure the method can only run once at a time if added to a loop.
Quick update: the ARC hangs, errors, and reboots seem to be fixed with the latest update. (I have not had one on the new update so far.)
My EZB camera framerate issues seem to be Wi-Fi related. (I should have thought of that straight away.)
LM Studio: I did get chat working with local models again. It appears to be an issue when I use multiple models loaded at the same time (1 chat, 1 vision), so I have to run one instance per server. I still can't get vision working with local models; I just get "error occurred". I think this has to be LM Studio related, not ARC, because if it works with GPT-4o, it should work with the LM Studio API.
Oh nice, looks like it was updated. The JSON communication protocol documents are standardized by OpenAI, so I'm guessing any JSON serialization errors are due to your server software using a differently formatted document.
The vision upgrade for ARC was a pleasant surprise. My project was on hold waiting for this, so I'm very happy.
I think I need to get local vision models working with a Python script first, via the API. The vision models seem to work locally in LM Studio but not via the API. I asked on the Discord. Crickets.
I tried Claude, Gemini, and a couple of others, but I still don't have one that works with ARC besides ChatGPT, even though they all do vision.
If I can get one working via the API, there is an LLM proxy called LiteLLM that I can use as an API gateway to make sure everything is formatted correctly, so I will give that a try later.
When ARCx comes out, I would love it if the latest YOLO were included. This way the robot can identify an object, and if it is something we want more info on, it can send the image off to an LLM for more details. For example: YOLO sees a car, sends the image off to ChatGPT, and gets the make, model, top speed, engine specs, etc. This way we are only analyzing what we are looking for, not everything, saving time and cost.
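The gating idea can be sketched independently of any particular detector: run the cheap local detector on every frame and only escalate labels of interest to the cloud model. (The labels, threshold, and prompt wording below are illustrative assumptions.)

```javascript
// Only detections we care about get escalated to the expensive LLM.
const INTERESTING = new Set(["car", "truck", "person"]);

// detections: array of { label, confidence } from any local detector (e.g. YOLO).
// Returns the follow-up prompts that would be sent to the LLM.
function escalate(detections, threshold = 0.5) {
  return detections
    .filter(d => INTERESTING.has(d.label) && d.confidence >= threshold)
    .map(d => "Identify the " + d.label + " in this image: make, model, and any notable details.");
}
```

Everything the detector sees but we don't care about (or isn't confident about) costs nothing, since it never leaves the robot.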
v19 was updated to specify a lowercase jpeg in the image request, as well as to display more details during the request.