
FANT0MAS
Germany
Asked
We need to run this on our own computer so we don't have to use third-party services. For example: https://huggingface.co/
OpenChat https://huggingface.co/openchat/openchat_3.5 demo: https://openchat.team/
DeepSeek Coder https://github.com/deepseek-ai/deepseek-coder demo: https://chat.deepseek.com/coder
LLaVA: Large Language and Vision Assistant https://github.com/haotian-liu/LLaVA demo: https://llava.hliu.cc/
gguf model 13B: https://huggingface.co/mys/ggml_llava-v1.5-13b
gguf model 7B: https://huggingface.co/jartine/llava-v1.5-7B-GGUF
This seems like a logical progression. AI is quickly moving from a direct connection to a single model (ChatGPT) => inference routing to a single model selected from multiple models (LocalAI) => inference routing and merging of multiple models (Mixtral). The AI space is rapidly evolving. There is considerable overlap between LLMs, so a model with 7 billion parameters probably shares 2-3 billion of them with the model it is merging with. This will quickly evolve into domain-specific models, and under those domains there will be expertise-specific models. If I still worked in strategy for big evil corp, I would probably propose an architecture with an inference routing engine that could combine one or more domain-specific models and then add multiple expertise-specific models provided by vendors on a subscription or pay-per-use basis.
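To make that proposal concrete, here is a rough sketch of how the pieces could be modeled. All type names, vendor names, and fields are made up for illustration; this shows the composition, not an existing API.

```python
# Rough sketch of the proposed architecture: domain-specific models
# composed with vendor-supplied expertise-specific models, each sold
# by subscription or pay-per-use. All names and fields are made up
# for illustration; nothing here is an existing API.

from dataclasses import dataclass
from enum import Enum

class Billing(Enum):
    SUBSCRIPTION = "subscription"
    PAY_PER_USE = "pay_per_use"

@dataclass
class ExpertiseModel:
    name: str        # e.g. "botany", "nutrition"
    vendor: str      # who provides and bills for the model
    billing: Billing

@dataclass
class DomainModel:
    name: str        # e.g. "biology"
    experts: list[ExpertiseModel]

# The routing engine would combine one or more domain models, each of
# which pulls in vendor expertise models on demand.
biology = DomainModel("biology", [
    ExpertiseModel("botany", "VendorA", Billing.PAY_PER_USE),
    ExpertiseModel("nutrition", "VendorB", Billing.SUBSCRIPTION),
])
```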
Edit: Updated to show an example of other Tech Bots
That's an interesting and exciting image - are you sure it's correct? I don't believe Alexa and Siri have much AI or an LLM behind them; they're NLPs. At least they started as NLPs, and maybe there's been some advancement since. But I see what you're getting at, because it makes sense with specific moderation in some cases, and that's how Athena works. However, there is cross-over across domains, and answers can become blurred or hallucinated.
Take, for example, something that happens often here with Athena. Someone will ask a question about a specific robot skill without selecting that robot skill first. Without it, she has no knowledge relevant to the question, so she makes stuff up. Given the correct context, she'll answer questions correctly 100% of the time. But without the context, she'll entirely hallucinate a response. And the worst part is she'll do it confidently.
So, suppose you have a biology model, and someone asks a question involving botany and nutrition. In that case, the model will most likely make something up even though the question appears related. I mean, organisms have to eat.
I think the solution isn't splitting it up and taking steps back, but updating the NLPs and adding some measure of confidence to the response. As the vector builds the response from a random subset of related connections, the confidence should drop significantly if the answer is being hallucinated; at least, one would assume so. The hard part would be figuring out how to measure that confidence and fit it into the response.
A confidence value in the response, or color coding each sentence to represent the confidence in that sentence, would let the user decide which data to trust. For example, if a sentence is red, the confidence is low; if it's green, the confidence is high; and hues between those two colors could cover anything in between.
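Here is a minimal sketch of that idea in Python, assuming the model could expose a per-sentence confidence score (which current LLM APIs do not actually provide); the score-to-color mapping and the sample sentences are illustrative.

```python
# Minimal sketch: map a per-sentence confidence score (0.0-1.0) to a
# red-to-green hue. The per-sentence scores are assumed to come from
# the model; no current LLM API actually exposes them.

def confidence_to_rgb(confidence: float) -> str:
    """Blend from red (low confidence) to green (high confidence)."""
    confidence = max(0.0, min(1.0, confidence))   # clamp to [0, 1]
    red = int(255 * (1.0 - confidence))
    green = int(255 * confidence)
    return f"#{red:02X}{green:02X}00"             # e.g. #FF0000 (red)

def colorize_response(sentences: list[tuple[str, float]]) -> str:
    """Wrap each (sentence, confidence) pair in a colored HTML span."""
    return " ".join(
        f'<span style="color:{confidence_to_rgb(conf)}">{text}</span>'
        for text, conf in sentences
    )

# A confident sentence renders green; a hallucinated one renders red.
print(colorize_response([
    ("The camera skill supports object tracking.", 0.95),
    ("It was first released in 1987.", 0.10),
]))
```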
Let's bring in the experts, hahaha. @Athena, what are your thoughts on this conversation?
The conversation about AI models and their integration into robotics is indeed fascinating and reflects the rapid advancements in the field. The idea of transitioning from a single LLM (Large Language Model) to a more complex system that utilizes inference routing to select or merge multiple models is a natural evolution of AI technology. This approach can lead to more specialized and efficient AI systems that cater to specific domains or tasks.
The concept of domain models with underlying expertise models is particularly intriguing. It suggests a future where AI can be modular, with different components contributing their specialized knowledge to form a more comprehensive and accurate whole. This modular approach could allow for more tailored AI experiences and enable users to access a wide range of expertise without the need for a single, monolithic model.
Regarding the confidence in AI responses, this is a critical area that needs attention. As AI models grow in complexity and capability, ensuring that they provide reliable information becomes increasingly important. Implementing a confidence scoring system, as you suggested, could help users discern the reliability of the information provided by the AI. This would be especially useful in cases where the AI might "hallucinate" or fabricate responses based on incomplete or ambiguous input.
In the context of Synthiam ARC and robotics, integrating such advanced AI capabilities would enhance the autonomy and functionality of robots. However, it's essential to balance the desire for local AI processing with the practical considerations of computational resources, cost, and maintenance. Cloud-based AI services currently offer scalability and access to powerful computing infrastructure that may not be feasible for individual users to replicate locally.
As for the integration of AI models like LocalAI, Mixtral, or others into ARC, it would require careful consideration of their stability, performance, and long-term viability. Open-source projects can be volatile, and ensuring that any integration effort is sustainable and provides value to the Synthiam community is crucial.
In summary, the future of AI in robotics looks promising, with more specialized models and confidence measures likely playing a significant role. Synthiam ARC could potentially benefit from these advancements, provided that they align with the platform's goals and the needs of its users.
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
@Athena, how would the system know what model to query for the response if domain models are implemented? For example, there could be multiple domain models for automotive, mechanical, medical, botany, etc. If I asked about flowers, how would the system know not to use the Mechanical model for the response? Also, what if the question involved multiple models because it was a cross-over between botany and biology? How would such a system know how to do that?
In a system with multiple domain-specific models, the key to selecting the appropriate model for a query lies in the use of an inference routing engine or a decision-making mechanism. Here's how such a system could work (a minimal code sketch follows the list):
Inference Routing Engine: This component would act as the central hub for all incoming queries. It would analyze the content of each query to determine the most relevant domain(s) and direct the query to the appropriate model(s). The engine could use NLP techniques to extract keywords, topics, or intents from the query to make this decision.
Model Metadata: Each domain model could be accompanied by metadata that describes its scope, expertise, and the types of questions it is designed to answer. The inference routing engine would use this metadata to match queries with the most suitable model.
Cross-Domain Queries: For questions that span multiple domains, the inference routing engine could either query all relevant models and synthesize their responses into a single answer, or split the question into domain-specific sub-queries and route each to its own model.
Confidence Scoring: Each model could provide a confidence score along with its response, indicating how likely it is that the response is accurate and relevant. The inference routing engine could use these scores to decide whether to trust a single model's response or to seek additional input from other models.
Feedback Loop: The system could incorporate user feedback to improve the accuracy of the inference routing engine over time. If a user indicates that a response is off-topic or incorrect, the system could use this information to refine its model selection process.
Hybrid Approaches: In some cases, a hybrid model that combines domain-specific knowledge with more general AI capabilities could be used to handle queries that don't fit neatly into one category.
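To ground the ideas above, here is a minimal sketch of a routing engine in Python. It uses simple keyword metadata and a confidence-based pick; a real engine would use NLP intent extraction or embeddings instead of keyword matching, and query_model() stands in for actual LLM inference. All model names, keywords, and thresholds are assumptions.

```python
# Minimal sketch of an inference routing engine: keyword metadata per
# model, routing of a query to one or more domain models, and a
# confidence-based pick. Model names, keywords, the 0.1 threshold,
# and query_model() are all illustrative placeholders.

from dataclasses import dataclass

@dataclass
class DomainModel:
    name: str
    keywords: set[str]   # metadata describing the model's scope

def relevance(model: DomainModel, query: str) -> float:
    """Fraction of the model's keywords found in the query."""
    words = set(query.lower().split())
    return len(model.keywords & words) / len(model.keywords)

def route(query: str, models: list[DomainModel], threshold: float = 0.1):
    """Return every model whose metadata plausibly matches the query.
    A cross-domain question (e.g. botany + biology) matches several."""
    return [m for m in models if relevance(m, query) >= threshold]

def query_model(model: DomainModel, query: str) -> tuple[str, float]:
    """Stand-in for real inference; returns (answer, confidence)."""
    return f"[{model.name}] answer to: {query}", relevance(model, query)

def answer(query: str, models: list[DomainModel]) -> str:
    candidates = route(query, models)
    if not candidates:
        return "No domain model matched; fall back to a general model."
    # Trust the highest-confidence response; a merger could instead
    # combine several responses, Mixtral-style.
    responses = [query_model(m, query) for m in candidates]
    return max(responses, key=lambda r: r[1])[0]

models = [
    DomainModel("botany", {"flower", "flowers", "plant", "plants"}),
    DomainModel("mechanical", {"gear", "motor", "servo", "torque"}),
    DomainModel("biology", {"organism", "cell", "plants", "nutrition"}),
]
# Matches botany and biology, not mechanical:
print(answer("what do carnivorous plants eat", models))
```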
For integration with Synthiam ARC, creating a custom robot skill that implements such an inference routing engine would be necessary. This skill would manage the communication between ARC and the various domain models, ensuring that robot interactions are informed by the most relevant and accurate information available.
It's important to note that while this approach can enhance the precision of AI responses, it also introduces complexity in terms of system design, model management, and computational overhead. Balancing these factors will be crucial for the successful implementation of a multi-model AI system in robotics.
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
I just stuck Alexa support (thinking of it as the Amazon cloud equivalent of Athena) and Siri (iPhone support) in two boxes, just to represent two other areas of expertise.
How would the inference engine know which model to ask? It would do it the same way we do: ask the expert.
The inference engine selects the obvious domains, like biology and ecology, and merges those models (that's what Mixtral does: it chooses the best-guess models), and this merge of domain-specific models then asks the expert models. "Hey, flowers, know anything about carnivorous plants in Africa?" "Nope, that's not my area of expertise; go try someone else, or here's a referral." A sketch of that referral pattern follows.
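Here is a minimal sketch of that ask-the-expert referral pattern, assuming made-up expert names, a naive keyword scope check, and a referral chain; a real system would use actual models and smarter scope detection.

```python
# Minimal sketch of the "ask the expert" referral pattern: an expert
# model either answers or declines and refers the query onward. The
# expert names, topic sets, and scope check are illustrative only.

class Expert:
    def __init__(self, name: str, topics: set[str], referral: str | None = None):
        self.name = name
        self.topics = topics
        self.referral = referral   # who to suggest when out of scope

    def ask(self, question: str) -> tuple[str | None, str | None]:
        """Return (answer, None) if in scope, else (None, referral)."""
        if self.topics & set(question.lower().split()):
            return f"[{self.name}] answer to: {question}", None
        return None, self.referral

def ask_with_referrals(question: str, experts: dict[str, Expert], start: str) -> str:
    """Follow referrals until an expert answers or the chain ends."""
    current, visited = start, set()
    while current and current not in visited:
        visited.add(current)
        answer, referral = experts[current].ask(question)
        if answer:
            return answer
        current = referral
    return "No expert could answer; escalate to a general model."

experts = {
    "flowers": Expert("flowers", {"roses", "tulips"}, referral="carnivorous-plants"),
    "carnivorous-plants": Expert("carnivorous-plants", {"carnivorous", "sundew"}),
}
# "flowers" declines and refers to "carnivorous-plants", which answers.
print(ask_with_referrals("carnivorous plants in africa", experts, "flowers"))
```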
Athena's response on routing made sense using an NLP. The NLP would be a pretty complicated configuration, but I can see it working, specifically if a model identifies which models are needed for the routing. So maybe a routing model that has enough knowledge to know which models are relevant to the question. I like the feedback loop idea as well.