Germany
Asked

Idea To Run Our Own GPT

We need to run it on our own computer so that we don't have to use third-party services. For example: https://huggingface.co/

OpenChat https://huggingface.co/openchat/openchat_3.5    demo: https://openchat.team/

DeepSeek Coder https://github.com/deepseek-ai/deepseek-coder demo: https://chat.deepseek.com/coder

LLaVA: Large Language and Vision Assistant https://github.com/haotian-liu/LLaVA demo: https://llava.hliu.cc/

GGUF model 13B: https://huggingface.co/mys/ggml_llava-v1.5-13b

GGUF model 7B: https://huggingface.co/jartine/llava-v1.5-7B-GGUF


PRO
Canada
#33  

Since all of the architectures seem to have adopted the OpenAI API, I guess all we really need to worry about for now is having a chatbot that supports the OpenAI API (we have that now) and the ability to change the URL that points to where the API resides.

PRO
Canada
#34  

Google AutoRT looks interesting. It is designed to use AI to coordinate tasks for up to 20 robots. AutoRT blog

PRO
Synthiam
#35  

Seeing your other comment from December, maybe you missed this, but you can set the base domain in the ChatGPT skill.

User-inserted image

PRO
Canada
#36  

Oh wow, I did miss that. I will test it out, thanks.

PRO
Canada
#37  

WOW IT WORKS !!!

Install LM Studio. Follow the instructions in the LM Studio video below until you get to the "copy API key" point. Then paste the following in as your Base Domain: http://localhost:1234/v1

And it works. OK, I don't have a $3000 Nvidia 4090 and I can't get my old AMD GPU to work, so it took a while for an answer to come back, but we appear to have a solution for local API models using LM Studio. Now who wants to contribute to my 4090 GoFundMe? :D
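
For anyone who wants to check the local server outside ARC, here is a minimal Python sketch of the same idea: an OpenAI-style chat request where the base URL is just a setting. It assumes the LM Studio server is running on the base domain given above and accepts the /chat/completions route. The function names (build_chat_request, ask), the default temperature, and the placeholder API key are illustrative assumptions, not anything taken from ARC or LM Studio.

```python
import json
import urllib.request

# Base domain from the post above; swap it for any other OpenAI-compatible server.
LOCAL_BASE_URL = "http://localhost:1234/v1"


def build_chat_request(base_url, messages, model="LM Studio", temperature=0.5,
                       api_key="not-needed"):
    """Build (but do not send) a POST to <base_url>/chat/completions."""
    url = base_url.rstrip("/") + "/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": messages,
        "temperature": temperature,
    }).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Placeholder value (assumption): a local server may simply ignore it.
        "Authorization": "Bearer " + api_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def ask(base_url, messages, timeout=300):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(base_url, messages)
    # Generous timeout because CPU-only inference can take a while.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]
```

With the server running, a call such as `ask(LOCAL_BASE_URL, [{"role": "user", "content": "what is your name"}])` should return the reply text. Switching to a different OpenAI-compatible server should only require a different base URL.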

PRO
Synthiam
#38  

@Nink I'll have to give this a shot. Have you had much experience testing it yet? Specifically, does it handle embeddings and large token submissions?

PRO
Canada
#39  

I haven't had a chance to play around yet. I just posted some quick chat questions and then was trying to get some other LLMs to work. It looks like it handles up to 32768 tokens, but I set mine to 4096 as I have no GPU. It seems to handle pre-prompts, and an override works, so it knows it is a sarcastic robot called Synthiam. You may be able to provide a path to an image, but I haven't tried that yet.

[2024-01-06 21:14:20.472] [INFO] [LM STUDIO SERVER] Context Overflow Policy is: Rolling Window
[2024-01-06 21:14:20.473] [INFO] Provided inference configuration: {
"n_threads": 4,
"n_predict": -1,
"top_k": 40,
"min_p": 0.05,
"top_p": 0.95,
"temp": 0.5,
"repeat_penalty": 1.1,
"input_prefix": "### Instruction:\n",
"input_suffix": "\n### Response:\n",
"antiprompt": [
"### Instruction:"
],
"pre_prompt": "Your name is Synthiam and you're a sarcastic robot that makes jokes. You can move around, dance, laugh and tell jokes. You have a camera to see people with. Your responses are 1 or 2 sentences only.",
"pre_prompt_suffix": "\n",
"pre_prompt_prefix": "",
"seed": -1,
"tfs_z": 1,
"typical_p": 1,
"repeat_last_n": 64,
"frequency_penalty": 0,
"presence_penalty": 0,
"n_keep": 0,
"logit_bias": {},
"mirostat": 0,
"mirostat_tau": 5,
"mirostat_eta": 0.1,
"memory_f16": true,
"multiline_input": false,
"penalize_nl": true
}
[2024-01-06 21:14:20.473] [INFO] [LM STUDIO SERVER] Last message: { role: 'user', content: 'what is your name' } (total messages = 18)
[2024-01-06 21:14:35.435] [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
[2024-01-06 21:14:35.436] [INFO] Accumulated 1 tokens: My
[2024-01-06 21:14:35.611] [INFO] Accumulated 2 tokens: My name
[2024-01-06 21:14:35.771] [INFO] Accumulated 3 tokens: My name is
[2024-01-06 21:14:35.931] [INFO] Accumulated 4 tokens: My name is Syn
[2024-01-06 21:14:36.091] [INFO] Accumulated 5 tokens: My name is Synth
[2024-01-06 21:14:36.251] [INFO] Accumulated 6 tokens: My name is Synthiam
[2024-01-06 21:14:36.411] [INFO] Accumulated 7 tokens: My name is Synthiam.
[2024-01-06 21:14:36.619] [INFO] [LM STUDIO SERVER] Generated prediction: {
"id": "chatcmpl-fxleyektj7urii8xdnpi",
"object": "chat.completion",
"created": 1704593660,
"model": "C:\\Users\\peter\\.cache\\lm-studio\\models\\TheBloke\\dolphin-2.2.1-mistral-7B-GGUF\\dolphin-2.2.1-mistral-7b.Q5_K_M.gguf",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "My name is Synthiam."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 377,
"completion_tokens": 7,
"total_tokens": 384
}
}
[2024-01-06 21:14:36.623] [INFO] [LM STUDIO SERVER] Processing queued request...
[2024-01-06 21:14:36.624] [INFO] Received POST request to /v1/chat/completions with body: {
"messages": [
{
"role": "system",
"content": "What phrase in the list below best describes the sentence? If you don't know, respond with 'I don't know'"
},
{
"role": "user",
"content": "Sentence: My name is Synthiam."
},
{
"role": "user",
"content": "Option: Dance"
},
{
"role": "user",
"content": "Option: Sad"
},
{
"role": "user",
"content": "Option: Happy"
},
{
"role": "user",
"content": "Option: Navigate to kitchen"
},
{
"role": "user",
"content": "Option: Laugh"
},
{
"role": "user",
"content": "Option: Exercise"
}
],
"model": "LM Studio",
"temperature": 0
}
[2024-01-06 21:14:36.624] [INFO] [LM STUDIO SERVER] Context Overflow Policy is: Rolling Window
[2024-01-06 21:14:36.624] [INFO] Provided inference configuration: {
"n_threads": 4,
"n_predict": -1,
"top_k": 40,
"min_p": 0.05,
"top_p": 0.95,
"temp": 0,
"repeat_penalty": 1.1,
"input_prefix": "### Instruction:\n",
"input_suffix": "\n### Response:\n",
"antiprompt": [
"### Instruction:"
],
"pre_prompt": "What phrase in the list below best describes the sentence? If you don't know, respond with 'I don't know'",
"pre_prompt_suffix": "\n",
"pre_prompt_prefix": "",
"seed": -1,
"tfs_z": 1,
"typical_p": 1,
"repeat_last_n": 64,
"frequency_penalty": 0,
"presence_penalty": 0,
"n_keep": 0,
"logit_bias": {},
"mirostat": 0,
"mirostat_tau": 5,
"mirostat_eta": 0.1,
"memory_f16": true,
"multiline_input": false,
"penalize_nl": true
}
[2024-01-06 21:14:36.624] [INFO] [LM STUDIO SERVER] Last message: { role: 'user', content: 'Option: Exercise' } (total messages = 8)
[2024-01-06 21:14:40.568] [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
[2024-01-06 21:14:40.569] [INFO] Accumulated 1 tokens: I
[2024-01-06 21:14:40.728] [INFO] Accumulated 2 tokens: I don
[2024-01-06 21:14:40.888] [INFO] Accumulated 3 tokens: I don'
[2024-01-06 21:14:41.047] [INFO] Accumulated 4 tokens: I don't
[2024-01-06 21:14:41.207] [INFO] Accumulated 5 tokens: I don't know
[2024-01-06 21:14:41.398] [INFO] [LM STUDIO SERVER] Generated prediction: {
"id": "chatcmpl-ykb8big5h7lz5c8r1xxz4",
"object": "chat.completion",
"created": 1704593676,
"model": "C:\\Users\\peter\\.cache\\lm-studio\\models\\TheBloke\\dolphin-2.2.1-mistral-7B-GGUF\\dolphin-2.2.1-mistral-7b.Q5_K_M.gguf",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "I don't know"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 102,
"completion_tokens": 5,
"total_tokens": 107
}
}
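
The timestamps in the log above are enough to estimate generation speed on this CPU-only setup. A rough sketch, assuming the "Accumulated N tokens" lines keep the format shown above (the regex and the function name tokens_per_second are illustrative, not part of LM Studio):

```python
import re
from datetime import datetime

# Matches lines such as:
# [2024-01-06 21:14:35.611] [INFO] Accumulated 2 tokens: My name
LINE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\] \[INFO\] Accumulated (\d+) tokens"
)


def tokens_per_second(log_text):
    """Estimate generation speed from the first and last 'Accumulated' lines."""
    samples = [
        (datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f"), int(count))
        for stamp, count in LINE.findall(log_text)
    ]
    if len(samples) < 2:
        raise ValueError("need at least two 'Accumulated' lines")
    (start, first_count), (end, last_count) = samples[0], samples[-1]
    elapsed = (end - start).total_seconds()
    if elapsed <= 0:
        raise ValueError("timestamps must increase")
    return (last_count - first_count) / elapsed


# First and last token lines of the first reply in the log above.
first_reply = """\
[2024-01-06 21:14:35.436] [INFO] Accumulated 1 tokens: My
[2024-01-06 21:14:36.411] [INFO] Accumulated 7 tokens: My name is Synthiam.
"""
print(round(tokens_per_second(first_reply), 1))  # 6.2
```

That works out to roughly 6 tokens per second once generation starts. The log also shows about 15 seconds between the server logging the request (21:14:20.473) and the first token (21:14:35.436) for a 377-token prompt, so most of the wait appears to be prompt processing rather than generation.
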
PRO
Canada
#40  

It looks like you can buy three second-hand RTX 3090s for the price of one new RTX 4090, so I think I will go the 3090 route for GPUs, as you can also run Mixtral if you have two or more cards and 32GB of RAM available (although I can't test whether Mixtral works with ARC until I have the cards). With Mixtral on a two-card setup, hopefully we will get performance and AI capabilities similar to ChatGPT 3.5.
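
As a rough sanity check on the two-card idea, here is a back-of-the-envelope estimate of whether a quantized Mixtral fits in GPU memory. Every number is an assumption rather than something measured in this thread: about 46.7 billion parameters for Mixtral 8x7B, roughly 4.5 bits per weight for a 4-bit GGUF quantization, 24 GB of VRAM per RTX 3090, and an arbitrary 20% allowance for context and runtime overhead. The function names are illustrative.

```python
def model_size_gb(params_billion, bits_per_weight):
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion, bits_per_weight, cards, vram_per_card_gb=24.0,
         overhead=0.20):
    """True if the weights plus a fractional overhead fit in the combined VRAM."""
    needed = model_size_gb(params_billion, bits_per_weight) * (1 + overhead)
    return needed <= cards * vram_per_card_gb


# Assumed figures: Mixtral 8x7B ~46.7B parameters, ~4.5 bits/weight at 4-bit.
weights = model_size_gb(46.7, 4.5)
print(f"{weights:.1f} GB of weights")  # 26.3 GB of weights
print(fits(46.7, 4.5, cards=1))        # False
print(fits(46.7, 4.5, cards=2))        # True
```

Under those assumptions the weights alone exceed a single 24 GB card but fit across two, which is consistent with the "two or more cards" point above. Real memory use depends on the quantization chosen, the context length, and how the runtime splits layers across cards.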