Locally running web search using LLM chains
A chat interface built on llama.cpp for running Alpaca models. No API keys; entirely self-hosted!
Inference for text-embeddings in Python
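As a toy illustration of what text-embedding inference produces (fixed-size vectors compared by cosine similarity), here is a self-contained sketch. The hash-based bag-of-words embedder is a stand-in for illustration only; a real service would run a neural model:

```python
import hashlib
import math

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy embedder: hash each token into a slot of a fixed-size vector.
    A real embedding service would run a neural model instead."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]  # L2-normalise so cosine is a dot product

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two L2-normalised vectors."""
    return sum(x * y for x, y in zip(a, b))

query = embed("local llm inference")
print(round(cosine(query, embed("local llm inference")), 2))  # identical text -> 1.0
```

The interface mirrors what embedding servers expose: text in, normalised vector out, with similarity computed client-side.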
Create, deploy and operate Actions using Python anywhere to enhance your AI agents and assistants. Batteries included with an extensive set of libraries, helpers and logging.
MII enables low-latency, high-throughput inference, similar to vLLM; it is powered by DeepSpeed.
Simple API for deploying any RAG pipeline or LLM you want, with support for adding plugins.
Gateway streamlines requests to 100+ open- and closed-source models with a unified API. It is also production-ready, with support for caching, fallbacks, retries, timeouts, and load balancing, and can be edge-deployed for minimal latency.
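The retry-and-fallback behaviour such a gateway provides can be sketched in a few lines. The provider names and the `call_model` function below are hypothetical stand-ins, not Gateway's actual API:

```python
import time

def call_model(provider: str, prompt: str) -> str:
    """Hypothetical upstream call; a real gateway would issue an HTTP request."""
    if provider == "flaky-provider":
        raise TimeoutError("upstream timed out")
    return f"{provider}: echo {prompt}"

def gateway_request(prompt: str, providers: list[str], retries: int = 2) -> str:
    """Try each provider in order, retrying transient failures,
    and fall back to the next provider once retries are exhausted."""
    for provider in providers:
        for _attempt in range(retries):
            try:
                return call_model(provider, prompt)
            except TimeoutError:
                time.sleep(0)  # real code would back off exponentially
    raise RuntimeError("all providers failed")

print(gateway_request("hi", ["flaky-provider", "stable-provider"]))
# falls back and prints: stable-provider: echo hi
```

Caching and timeout handling would wrap the same loop; the key design point is that fallback order and retry count live in the gateway, not in every client.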