
FANT0MAS
Germany
Asked
We need to run it on our own computer so as not to rely on third-party services. For example: https://huggingface.co/
OpenChat https://huggingface.co/openchat/openchat_3.5 demo: https://openchat.team/
DeepSeek Coder https://github.com/deepseek-ai/deepseek-coder demo: https://chat.deepseek.com/coder
LLaVA: Large Language and Vision Assistant https://github.com/haotian-liu/LLaVA demo: https://llava.hliu.cc/
GGUF model 13B: https://huggingface.co/mys/ggml_llava-v1.5-13b GGUF model 7B: https://huggingface.co/jartine/llava-v1.5-7B-GGUF
Since all of these architectures seem to have adopted the OpenAI API, I guess all we really need to worry about for now is having a chatbot that supports the OpenAI API (we have that now) and the ability to change the URL that points to where the API resides.
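To illustrate why the base URL is the only thing that matters: every OpenAI-compatible server accepts the same request shape at the same endpoint path, so switching providers is just a matter of swapping one string. A minimal standard-library sketch (the endpoint path and payload fields follow the OpenAI chat completions API; the base URL, model name, and `build_chat_request` helper are placeholders for illustration):

```python
import json
import urllib.request

def build_chat_request(base_url, messages, model="local-model", api_key="not-needed"):
    """Build an OpenAI-style chat completion request.

    Only base_url changes between providers, e.g.
    "https://api.openai.com/v1" for OpenAI itself versus a local server.
    """
    payload = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={
            "Content-Type": "application/json",
            # Local servers typically ignore the key, but clients
            # written for OpenAI always send the header.
            "Authorization": f"Bearer {api_key}",
        },
    )

# Sending it (requires a running server at base_url):
#   with urllib.request.urlopen(build_chat_request(url, msgs)) as resp:
#       reply = json.loads(resp.read())["choices"][0]["message"]["content"]
```

The same pattern covers any of the projects listed above, provided they expose an OpenAI-compatible endpoint.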
Google AutoRT looks interesting. It is designed to use AI to coordinate the tasks of up to 20 robots. AutoRT blog
Seeing your other comment from December, maybe you missed this, but you can set the base domain in the ChatGPT skill.
Oh wow I did miss that. I will test it out thanks.
WOW IT WORKS !!!
Install LM Studio and follow the instructions in the LM Studio video below until you get to the "copy API key" point. Then paste the following in as your Base Domain:

http://localhost:1234/v1

And it works. OK, I don't have a $3000 Nvidia 4090 and I can't get my old AMD GPU to work, so it took a while for an answer to come back, but we appear to have a solution for local API models using LM Studio. Now, who wants to contribute to my 4090 GoFundMe?
@Nink I'll have to give this a shot. Have you had much experience testing it yet? Specifically, does it handle embeddings and large token submissions?
I haven't had a chance to play around much yet. I just posted some quick chat questions, then was trying to get some other LLMs to work. It looks like it handles up to 32,768 tokens, but I set mine at 4,096 since I have no GPU. It seems to handle pre-prompts, and an override works, so it knows it is a sarcastic robot called Synthiam. You may also be able to provide a path to an image, but I haven't tried that yet.
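For what it's worth, the pre-prompt override and the token cap described above map onto standard fields of the same API: the override is a "system" message and the cap is the max_tokens parameter, while embeddings go to a separate /embeddings endpoint. A hedged sketch of the two payload shapes (field names follow the OpenAI API; the model names and the personality text are placeholders, and whether a given local server actually serves embeddings depends on what it has loaded):

```python
import json

# Chat payload with a pre-prompt override and a token cap.
# The "system" message is how the bot "knows it is a sarcastic
# robot called Synthiam"; max_tokens limits the reply length.
chat_payload = {
    "model": "local-model",   # placeholder; LM Studio uses whatever model is loaded
    "max_tokens": 4096,       # the cap mentioned above for a CPU-only setup
    "messages": [
        {"role": "system", "content": "You are a sarcastic robot called Synthiam."},
        {"role": "user", "content": "What are you?"},
    ],
}

# Embeddings use their own endpoint, POST {base_url}/embeddings,
# with this payload shape.
embedding_payload = {
    "model": "local-embedding-model",  # placeholder
    "input": "text to embed",
}

# Both are sent as JSON request bodies:
chat_body = json.dumps(chat_payload).encode("utf-8")
```

Testing a local server against these two payloads is a quick way to answer the embeddings and large-token questions above.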
Looks like you can buy three second-hand RTX 3090s for the price of one new RTX 4090, so I think I will go the 3090 route for GPUs, since you can also run Mixtral if you have two or more cards and 32GB of RAM available (although I can't test whether Mixtral works with ARC until I have the cards). With Mixtral, hopefully a two-card setup will give us performance and AI capabilities similar to ChatGPT 3.5.