Germany
Asked

Idea To Run Our Own GPT

We need to run it on our own computer so we don't depend on third-party services. For example: https://huggingface.co/

OpenChat https://huggingface.co/openchat/openchat_3.5    demo: https://openchat.team/

DeepSeek Coder https://github.com/deepseek-ai/deepseek-coder demo: https://chat.deepseek.com/coder

LLaVA: Large Language and Vision Assistant https://github.com/haotian-liu/LLaVA demo: https://llava.hliu.cc/

GGUF model 13B: https://huggingface.co/mys/ggml_llava-v1.5-13b

GGUF model 7B: https://huggingface.co/jartine/llava-v1.5-7B-GGUF
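For anyone who wants to try one of those GGUF files offline, here is a minimal sketch using the llama-cpp-python bindings (my choice of tool, not something the links above require); the model filename is just a placeholder for whatever file you actually download:

```python
# Minimal sketch: run a downloaded GGUF model locally with llama-cpp-python.
# Assumes `pip install llama-cpp-python` and that a GGUF file (e.g. from the
# jartine/llava-v1.5-7B-GGUF repo) is in the working directory; the exact
# filename below is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="llava-v1.5-7b-Q4_K.gguf",  # placeholder filename
    n_ctx=2048,        # context window size
    n_gpu_layers=-1,   # offload all layers to the GPU if one is available
)

result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful robot assistant."},
        {"role": "user", "content": "Say hello in one short sentence."},
    ],
    max_tokens=64,
)
print(result["choices"][0]["message"]["content"])
```

Nothing in that snippet needs an internet connection once the model file is on disk, which is the whole point of the exercise.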



PRO
Synthiam
#17  

Yeah, that's a late night for you! But that's also what weekends are for :D

It would be wild if they created an API and you got some answers. Someone with a hefty GPU could even host an ARC community server. But, given how slow OpenAI is, I'm sure we wouldn't be much faster. My GPUs are only 1080s, for playing VR.

#18  

We need to install the software on the robot's computer and use it without the internet or a router. All commands go directly, without intermediaries.

In this case it's one computer, one user. I think any computer can serve a single user efficiently and quickly.

OpenAI is slow because each computer still receives millions of requests from millions of users. Then the signal travels a very long way through a huge number of providers, routers, hackers, and spies, and sometimes forgets where it is going altogether, thanks to our modern communication lines and internet providers.

As for the cost and size of hardware: I think anyone who can afford to build robots has enough money to buy the necessary equipment, and modern, powerful computers are smaller than the mini computers of the past.

PRO
Canada
#19  

Latency, eavesdropping, outages, and capacity limitations are all concerns with any centralized technology, including AI. The recent phenomenon of OpenAI's models actually becoming lazy as a countermeasure to overutilization, through resource optimization, is also interesting. The AI appears to be trying to ascertain whether some users will accept a "read the friendly manual" response while others need a detailed explanation. This has created a lazy AI model, and OpenAI doesn't know how to resolve it.

My interest in localized AI is around data and training protection. Data is king, and you need to be able to protect your intellectual property. Uploading it to an AI engine is essentially giving that knowledge away. As models become trained and their knowledge is integrated into this behemoth AI knowledge database, you lose your unique capabilities; now anyone can do it. You also risk losing control of your own knowledge and open yourself up to being easily extorted by the AI providers.

Synthiam has done a lot of work training Athena and pays a few cents every time we ask her a question. What happens when the price goes from a few cents a response to dollars a response? Do they keep paying? Tools like Athena need to be set free from the grips of their AI masters and run locally by their owners. Keep your data, AI, models, training, and intellectual property in house; don't pay someone else to take it.

We need the same capability with robotics. If we train a robot to do a task like washing the car or doing the dishes, we now have a unique capability with a competitive advantage. If others can do it as well, because we gave up our training data to a third party, we no longer have a unique business model. And even if we do maintain an AI advantage, our model can be held hostage by the provider, who takes an ever-increasing share of our revenue.

PRO
Synthiam
#20   — Edited

I don't think it's environmentally feasible to have localized GPTs.

For example, training a model takes significant energy. Training GPT-4 is estimated to have consumed between 51,773 MWh and 62,319 MWh, over 40 times what its predecessor, GPT-3, consumed. That is equivalent to the energy consumption of 1,000 average US households over 5 to 6 years. Now consider how many training runs are required to fine-tune your use case until it performs within your specification.
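As a rough sanity check on that household comparison (assuming the commonly cited ballpark of about 10.7 MWh per year for an average US household, which is my assumption, not a figure from the post):

```python
# Back-of-envelope check of the "5 to 6 years of 1,000 households" comparison.
# Assumes ~10.7 MWh/year per average US household (rough EIA-style ballpark).
HOUSEHOLD_MWH_PER_YEAR = 10.7

for training_mwh in (51_773, 62_319):
    household_years = training_mwh / HOUSEHOLD_MWH_PER_YEAR
    years_for_1000_homes = household_years / 1_000
    print(f"{training_mwh} MWh is about {years_for_1000_homes:.1f} years of 1,000 households")
# Prints roughly 4.8 and 5.8 years, i.e. right around the quoted 5 to 6 year range.
```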

A custom model is useful for your application until you have to make an alteration.

Also, it's not really the model itself that you're querying. Sure, you can fine-tune vectors to prioritize content within the model, but embedding works against an existing model to prioritize those vectors, so the existing model's data is required for the embedding to be useful.

What I mean is that the LLM you compile is only a dictionary for the content. The interaction still requires NLP and vector embeddings, and that increases the LLM size needed to accommodate your requirements.
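To make that concrete, here is a tiny retrieval-style sketch (my own illustration under assumed tooling, not how Athena is actually built): the embeddings only rank your private snippets, and a general-purpose LLM is still needed to turn the winning snippet into a natural-language answer.

```python
# Illustration of the point above: embeddings rank your local documents,
# but a full LLM is still required to phrase the final answer.
# Assumes `pip install sentence-transformers numpy`; the embedding model name
# is just a common small model, nothing specific to Synthiam or Athena.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

# 1. Embed your private "knowledge base" (the part you keep in house).
docs = [
    "The robot's camera is on servo port D2.",
    "Battery warnings trigger below 11.1 volts.",
]
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

# 2. Embed the question and pick the closest document by cosine similarity.
question = "Which port is the camera servo on?"
q_vec = embedder.encode([question], normalize_embeddings=True)[0]
best = docs[int(np.argmax(doc_vecs @ q_vec))]

# 3. The retrieved snippet is only context; a base LLM still has to turn it
#    into a reply, which is why the existing model remains necessary.
prompt = f"Context: {best}\nQuestion: {question}\nAnswer:"
print(prompt)
```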

Lastly, consider OpenAI's hardware versus yours. Sure, you can get a decent gaming GPU for $1,000+, but that's minuscule compared to what OpenAI's servers are using. With OpenAI, you're not sharing a single server with other users; you're sharing tens of thousands of servers.

Remember the announcement of GPT-3, when Microsoft got involved and invested $1B+ in Azure credits? A single GPT-3 server had 10 NVIDIA A100 80GB GPUs, and there were thousands of those servers. And that was only GPT-3.

I don't know if people fully comprehend the scale at which GPT-4 operates at the hardware level. Home DIY LLMs are fun and all for learning the tech, but they're nowhere near as capable.

In short, I'm saying cloud AI will exist until there's an affordable localized solution, and I'm doubtful that will happen, primarily because the next AI processors will be bio or quantum. Edge-device technology doesn't advance as quickly as commercial cloud infrastructure because it would be too expensive for us. Sharing computing resources that we normally couldn't afford is the only viable option.

And on cost: I think it's affordable today because it's new and OpenAI is pushing aggressively for adoption. They've essentially set the expected price, and competition will need to float around a similar price. It's doubtful it'll rise if it's being consumed as much as we think it will.

PRO
Canada
#21  

I have been exploring LocalAI, an OpenAI-compatible API that works with a number of different LLM models. In addition to the OpenAI-compatible API, which provides access to a local model from an external chat engine, it also has an inference engine that selects the appropriate large language model for handling your request. This means you don't need 130 billion parameters and 700 GB of RAM; it will load the appropriate LLaMA model depending on the query you make.

LocalAI is packaged as a container (deployable on Kubernetes) and can be installed on a Windows computer with or without a GPU. This is a unique approach: an API combined with a strong inference model can select the appropriate LLM, an architecture that could potentially provide GPT-3.5-level capabilities and performance, in conjunction with custom domain-specific models (like Athena), on consumer-grade hardware.

Today we have significant processing power in the average consumer's home on platforms like gaming PCs and consoles like the Xbox and PS5. I am confident companies will be looking at how to exploit these local platforms using approaches similar to LocalAI, without needing to utilize expensive centralized services.

https://localai.io/
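For anyone curious what "OpenAI compatible" means in practice, here is a small sketch pointing the regular OpenAI Python client at a LocalAI server instead of the cloud; the port and model name are placeholders for whatever your own install actually serves:

```python
# Minimal sketch of talking to a LocalAI instance through the standard OpenAI
# client library, since LocalAI exposes an OpenAI-compatible API.
# Assumes LocalAI is already running locally; the port and model name are
# examples, not fixed values.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # LocalAI endpoint instead of api.openai.com
    api_key="not-needed-locally",         # the client requires a value; LocalAI ignores it
)

response = client.chat.completions.create(
    model="llama-3-8b-instruct",  # placeholder for a model configured in LocalAI
    messages=[{"role": "user", "content": "What can you do offline?"}],
)
print(response.choices[0].message.content)
```

Because the request format is identical to OpenAI's, any tool that already speaks the OpenAI API should only need the base URL changed.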

Here is a view of the Architecture.

User-inserted image

PRO
Synthiam
#22  

Interesting stuff. Moving Athena from the cloud to run locally will eventually make sense if/when the cost gets too high relative to local computing infrastructure plus electricity. The thing is, companies operate servers in the cloud for regional accessibility, security, and of course, scalability. So it's not a deal-breaker for companies to rely on cloud services. That's why we're not seeing enterprise-quality development of these AI services for localization.

Now - would I like to have Athena run locally? Sure. I want everything local, but I'd rather not have to hire IT staff to maintain servers, security patches, reboots, hardware failures, networking, firewalls, and internet providers. Ugh! It gives me flashbacks to the early 2000s when everything was local. Or should I say LOCO? Haha

PRO
Synthiam
#23   — Edited

Keep updating me on what you think is solid enough for a robot skill integration. It would make sense to at least attempt an integration with something that will stick around long enough to justify the investment. Unlike an enterprise company pushing hardware or a service, open-source projects don't pay Synthiam for integration, so it's a cost out of our budget. Internal developers (or I) generally volunteer for anything open source that we do, to keep you guys all happy :)

PRO
Synthiam
#24  

This looks interesting. The guy in the video is difficult to watch because his style is to read what is on the screen, which takes forever, so fast-forward through it and skip to the meat. But it looks like an interesting model...