Idea to run our own GPT

FANT0MAS

Germany

Asked Nov 2023


We should run the model on our own computer so that we don't depend on third-party services. Models can be downloaded from, for example, https://huggingface.co/

OpenChat https://huggingface.co/openchat/openchat_3.5 demo: https://openchat.team/

DeepSeek Coder https://github.com/deepseek-ai/deepseek-coder demo: https://chat.deepseek.com/coder

LLaVA: Large Language and Vision Assistant https://github.com/haotian-liu/LLaVA demo: https://llava.hliu.cc/

GGUF model 13B: https://huggingface.co/mys/ggml_llava-v1.5-13b

GGUF model 7B: https://huggingface.co/jartine/llava-v1.5-7B-GGUF


Nink

PRO

Canada

#33 Dec 2023

Since all of the architectures seem to have adopted the OpenAI API, I guess all we really need to worry about for now is having a chatbot that supports the OpenAI API (we have that now) and the ability to change the URL that points to where the API resides.
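To illustrate the idea of a swappable API location, here is a minimal Python sketch. The `ChatEndpoint` class and the local address `http://localhost:8000/v1` are hypothetical names made up for this example; only the `/chat/completions` path and the hosted OpenAI base URL are real. It assumes the local server mirrors the OpenAI route layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatEndpoint:
    """Hypothetical holder for 'where the API resides'."""

    base_url: str                # e.g. hosted OpenAI or a local server
    api_key: str = "not-needed"  # assumption: many local servers ignore the key

    @property
    def chat_completions_url(self) -> str:
        # Tolerate a trailing slash so both ".../v1" and ".../v1/" work.
        return self.base_url.rstrip("/") + "/chat/completions"

    def headers(self) -> dict:
        return {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_key}",
        }


# Same client code, two different locations for the API.
hosted = ChatEndpoint("https://api.openai.com/v1", api_key="sk-...")
local = ChatEndpoint("http://localhost:8000/v1/")  # hypothetical local address
```

Everything that talks to the model would read the URL from such an object, so switching from a hosted service to a local one becomes a one-line change.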

Nink

PRO

Canada

#34 Jan 2024

Google AutoRT looks interesting. It is designed to use AI to coordinate tasks across up to 20 robots. See the AutoRT blog.

DJ Sures

PRO

Synthiam

#35 Jan 2024

Seeing your other comment from December, maybe you missed this, but you can set the base domain in the ChatGPT skill.

Nink

PRO

Canada

#36 Jan 2024

Oh wow, I did miss that. I will test it out, thanks.

Nink

PRO

Canada

#37 Jan 2024

WOW IT WORKS !!!

Install LM Studio. Follow the instructions in the LM Studio video below until you get to the point where you copy the API key. Then paste the following in as your Base Domain: http://localhost:1234/v1

And it works. I don't have a $3000 Nvidia 4090 and I can't get my old AMD GPU to work, so it took a while for an answer to come back, but we appear to have a solution for local API models using LM Studio. Now, who wants to contribute to my 4090 GoFundMe?
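For anyone who wants to check the local server outside ARC, a rough Python sketch using only the standard library could look like the following. It assumes LM Studio is serving on the Base Domain given above and accepts an OpenAI-style chat request; the function names `build_request` and `ask`, the `"LM Studio"` model placeholder, and the 300 second timeout are choices made for this example, not something LM Studio or ARC prescribes.

```python
import json
import urllib.request

BASE_DOMAIN = "http://localhost:1234/v1"  # the Base Domain from the post above


def build_request(user_text, system_text=None, base=BASE_DOMAIN):
    """Build (but do not send) an OpenAI-style chat completion request."""
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    body = {
        "model": "LM Studio",  # placeholder; assumes the server uses its loaded model
        "messages": messages,
        "temperature": 0.5,
    }
    return urllib.request.Request(
        base.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(user_text, system_text=None, timeout=300):
    """Send the request and return the assistant's reply text."""
    # Generous timeout: CPU-only inference can take a while to answer.
    request = build_request(user_text, system_text)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

With the server running, calling `ask("what is your name")` should return the model's answer as a string. If nothing is listening on port 1234, `urlopen` raises `urllib.error.URLError`.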

DJ Sures

PRO

Synthiam

#38 Jan 2024

@Nink I'll have to give this a shot. Have you had much experience testing it yet? Specifically, does it handle embeddings and large token submissions?

Nink

PRO

Canada

#39 Jan 2024

I haven't had a chance to play around yet. I just posted some quick chat questions, then was trying to get some other LLMs to work. It looks like it handles up to 32768 tokens, but I set mine to 4096 as I have no GPU. It seems to handle pre-prompts, and an override works, so it knows it is a sarcastic robot called Synthiam. You may be able to provide a path to an image, but I haven't tried that yet.

[2024-01-06 21:14:20.472] [INFO] [LM STUDIO SERVER] Context Overflow Policy is: Rolling Window
[2024-01-06 21:14:20.473] [INFO] Provided inference configuration: {
"n_threads": 4,
"n_predict": -1,
"top_k": 40,
"min_p": 0.05,
"top_p": 0.95,
"temp": 0.5,
"repeat_penalty": 1.1,
"input_prefix": "### Instruction:\n",
"input_suffix": "\n### Response:\n",
"antiprompt": [
"### Instruction:"
],
"pre_prompt": "Your name is Synthiam and you're a sarcastic robot that makes jokes. You can move around, dance, laugh and tell jokes. You have a camera to see people with. Your responses are 1 or 2 sentences only.",
"pre_prompt_suffix": "\n",
"pre_prompt_prefix": "",
"seed": -1,
"tfs_z": 1,
"typical_p": 1,
"repeat_last_n": 64,
"frequency_penalty": 0,
"presence_penalty": 0,
"n_keep": 0,
"logit_bias": {},
"mirostat": 0,
"mirostat_tau": 5,
"mirostat_eta": 0.1,
"memory_f16": true,
"multiline_input": false,
"penalize_nl": true
}
[2024-01-06 21:14:20.473] [INFO] [LM STUDIO SERVER] Last message: { role: 'user', content: 'what is your name' } (total messages = 18)
[2024-01-06 21:14:35.435] [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
[2024-01-06 21:14:35.436] [INFO] Accumulated 1 tokens: My
[2024-01-06 21:14:35.611] [INFO] Accumulated 2 tokens: My name
[2024-01-06 21:14:35.771] [INFO] Accumulated 3 tokens: My name is
[2024-01-06 21:14:35.931] [INFO] Accumulated 4 tokens: My name is Syn
[2024-01-06 21:14:36.091] [INFO] Accumulated 5 tokens: My name is Synth
[2024-01-06 21:14:36.251] [INFO] Accumulated 6 tokens: My name is Synthiam
[2024-01-06 21:14:36.411] [INFO] Accumulated 7 tokens: My name is Synthiam.
[2024-01-06 21:14:36.619] [INFO] [LM STUDIO SERVER] Generated prediction: {
"id": "chatcmpl-fxleyektj7urii8xdnpi",
"object": "chat.completion",
"created": 1704593660,
"model": "C:\\Users\\peter\\.cache\\lm-studio\\models\\TheBloke\\dolphin-2.2.1-mistral-7B-GGUF\\dolphin-2.2.1-mistral-7b.Q5_K_M.gguf",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "My name is Synthiam."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 377,
"completion_tokens": 7,
"total_tokens": 384
}
}
[2024-01-06 21:14:36.623] [INFO] [LM STUDIO SERVER] Processing queued request...
[2024-01-06 21:14:36.624] [INFO] Received POST request to /v1/chat/completions with body: {
"messages": [
{
"role": "system",
"content": "What phrase in the list below best describes the sentence? If you don't know, respond with 'I don't know'"
},
{
"role": "user",
"content": "Sentence: My name is Synthiam."
},
{
"role": "user",
"content": "Option: Dance"
},
{
"role": "user",
"content": "Option: Sad"
},
{
"role": "user",
"content": "Option: Happy"
},
{
"role": "user",
"content": "Option: Navigate to kitchen"
},
{
"role": "user",
"content": "Option: Laugh"
},
{
"role": "user",
"content": "Option: Exercise"
}
],
"model": "LM Studio",
"temperature": 0
}
[2024-01-06 21:14:36.624] [INFO] [LM STUDIO SERVER] Context Overflow Policy is: Rolling Window
[2024-01-06 21:14:36.624] [INFO] Provided inference configuration: {
"n_threads": 4,
"n_predict": -1,
"top_k": 40,
"min_p": 0.05,
"top_p": 0.95,
"temp": 0,
"repeat_penalty": 1.1,
"input_prefix": "### Instruction:\n",
"input_suffix": "\n### Response:\n",
"antiprompt": [
"### Instruction:"
],
"pre_prompt": "What phrase in the list below best describes the sentence? If you don't know, respond with 'I don't know'",
"pre_prompt_suffix": "\n",
"pre_prompt_prefix": "",
"seed": -1,
"tfs_z": 1,
"typical_p": 1,
"repeat_last_n": 64,
"frequency_penalty": 0,
"presence_penalty": 0,
"n_keep": 0,
"logit_bias": {},
"mirostat": 0,
"mirostat_tau": 5,
"mirostat_eta": 0.1,
"memory_f16": true,
"multiline_input": false,
"penalize_nl": true
}
[2024-01-06 21:14:36.624] [INFO] [LM STUDIO SERVER] Last message: { role: 'user', content: 'Option: Exercise' } (total messages = 8)
[2024-01-06 21:14:40.568] [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
[2024-01-06 21:14:40.569] [INFO] Accumulated 1 tokens: I
[2024-01-06 21:14:40.728] [INFO] Accumulated 2 tokens: I don
[2024-01-06 21:14:40.888] [INFO] Accumulated 3 tokens: I don'
[2024-01-06 21:14:41.047] [INFO] Accumulated 4 tokens: I don't
[2024-01-06 21:14:41.207] [INFO] Accumulated 5 tokens: I don't know
[2024-01-06 21:14:41.398] [INFO] [LM STUDIO SERVER] Generated prediction: {
"id": "chatcmpl-ykb8big5h7lz5c8r1xxz4",
"object": "chat.completion",
"created": 1704593676,
"model": "C:\\Users\\peter\\.cache\\lm-studio\\models\\TheBloke\\dolphin-2.2.1-mistral-7B-GGUF\\dolphin-2.2.1-mistral-7b.Q5_K_M.gguf",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "I don't know"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 102,
"completion_tokens": 5,
"total_tokens": 107
}
}
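The second request in the log is a classification call: a system instruction, the sentence, then one user message per option, sent with temperature 0. A small Python sketch of that message layout is below. The payload shape is copied from the log; the helper names and the way the reply is matched back to an option are assumptions for illustration, since the log does not show how ARC interprets the answer.

```python
SYSTEM_PROMPT = (
    "What phrase in the list below best describes the sentence? "
    "If you don't know, respond with 'I don't know'"
)


def build_classification_payload(sentence, options):
    """Recreate the request body layout seen in the log above."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Sentence: {sentence}"},
    ]
    # One user message per candidate option, as in the logged request.
    messages += [{"role": "user", "content": f"Option: {o}"} for o in options]
    return {"messages": messages, "model": "LM Studio", "temperature": 0}


def pick_option(response, options):
    """Map the reply text to one of the options, or None (assumed matching rule)."""
    text = response["choices"][0]["message"]["content"].strip().rstrip(".")
    for option in options:
        if text.lower() == option.lower():
            return option
    return None  # covers "I don't know" and any unexpected reply


OPTIONS = ["Dance", "Sad", "Happy", "Navigate to kitchen", "Laugh", "Exercise"]
payload = build_classification_payload("My name is Synthiam.", OPTIONS)
```

For the reply logged above ("I don't know"), `pick_option` returns `None`, which a caller could treat as "take no action".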

Nink

PRO

Canada

#40 Jan 2024

It looks like you can buy three second-hand RTX 3090s for the price of one new RTX 4090, so I think I will go the 3090 route for GPUs, as you can also run Mixtral if you have two or more cards and 32GB RAM available (although I can't test whether Mixtral works with ARC until I have the cards). With Mixtral on a two-card setup, hopefully we will get performance and AI capabilities similar to ChatGPT 3.5.
