OK, great. I noticed GPT-4o says, in effect, "give me an image in any format and I will work it out," whereas everyone else wants base64:
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What’s in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                        "detail": "high",
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)
I'm curious: is the image sent as a JPG, PNG, etc., or is it converted to base64 and sent as text? It looks like LM Studio will only take a photo in base64 format, whereas GPT-4o will take a PNG or JPG.
It is sent as JPEG binary encoded to ASCII via base64 (per the OpenAI specification).
That example is for a URL, which requires a web server that you do not have. If you hosted a web server on the internet with images, you could use that example. Instead, the proper usage is to base64-encode the binary and include it with the message.
Additionally, the message JSON is assembled by OpenAI's SDK for their API. The message is not formatted and created by the robot skill, as the skill uses their SDK for their standard. Because the message works with OpenAI, we can assume the third-party system that you're using has issues.
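For anyone following along in Python, here is a minimal sketch of the base64 approach described above. It is not the robot skill's code. The helper names `image_bytes_to_data_url` and `build_vision_message` are made up for illustration, and the placeholder bytes stand in for a real image file. The message shape mirrors the URL example earlier in the thread, with a data URL in place of the web URL.

```python
import base64


def image_bytes_to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_vision_message(prompt: str, data_url: str) -> dict:
    """Build a user message shaped like the URL example, but carrying a data URL."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url, "detail": "high"}},
        ],
    }


# Placeholder bytes standing in for a real JPEG. In practice you would read
# the file yourself, e.g. open("photo.jpg", "rb").read() (hypothetical name).
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 8

message = build_vision_message(
    "What's in this image?",
    image_bytes_to_data_url(fake_jpeg),
)

# Show only the start of the URL; a real image produces a very long string.
print(message["content"][1]["image_url"]["url"][:40])
```

The resulting dictionary can be passed in the `messages` list exactly as in the URL example, so the image travels as text inside the JSON rather than as a separate binary upload.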
I think this conversation is starting to get off topic, as it's about third-party products using the same OpenAI protocol. I'll make a new thread for it.
Okay, here we go... Let me see. This is how the image is sent using the SDK for the OpenAI API...
As Synthiam support says, there's no way the JSON is "created manually by the robot skill". The API has a specification for the JSON format, and the SDK fulfills that specification; both are by OpenAI. The output of the SDK will be a formatted document that the OpenAI API requires.
If you're using a third-party product that claims to be compatible with OpenAI, I'd challenge them on it, because something isn't compatible.
OK, thanks. I don't know C#, but looking at the code, it appears to be sending this as a binary-encoded image and not the base64 image that my tool wants to receive.
OpenAI Python Example Base64
https://platform.openai.com/docs/guides/vision
No, it means taking an IMAGE in BINARY FORMAT and encoding it as base64. It's essentially the same command that your Python is showing. Python is a different language, so the commands will be different. Also, it appears that the Python isn't using an OpenAI SDK for the API.
This is the OpenAI command code you're asking about.
I asked for the OpenAI robot skill to be updated. I noticed it was using uppercase JPEG where it should be lowercase. That shouldn't matter, but maybe it does to your open-source tool.
I.e., it was
"data:image/JPEG;base64,{1}",
and is now
"data:image/jpeg;base64,{1}",
shrug
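On the uppercase/lowercase point: MIME type names are defined as case-insensitive, so a tolerant receiver can simply lowercase the type before comparing it. The sketch below shows one way a receiving tool might do that. The function `parse_image_data_url` is hypothetical and is not taken from LM Studio or any other product.

```python
import base64


def parse_image_data_url(data_url: str) -> tuple[str, bytes]:
    """Split a base64 data URL into (lowercased MIME type, decoded bytes)."""
    header, _, payload = data_url.partition(",")
    lowered = header.lower()
    if not lowered.startswith("data:") or not lowered.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    # Strip the "data:" prefix and ";base64" suffix to leave the MIME type.
    mime_type = lowered[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


# Both spellings from the thread resolve to the same type and the same bytes.
old_style = parse_image_data_url("data:image/JPEG;base64,YWJj")
new_style = parse_image_data_url("data:image/jpeg;base64,YWJj")
print(old_style == new_style)
```

A receiver that compares the type string exactly, without normalizing case, would treat the two spellings differently, which is the kind of incompatibility the lowercase change guards against.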