
Inference Engines
FastChat
A distributed multi-model LLM serving system with web UI and OpenAI-compatible RESTful APIs.
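Because FastChat exposes OpenAI-compatible endpoints, any OpenAI-style client can talk to a FastChat server. A minimal sketch of constructing such a chat-completion request; the endpoint URL and model name here are illustrative assumptions, not values taken from this document:

```python
import json

# Hypothetical local FastChat endpoint; adjust host/port for your deployment.
API_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model, messages, temperature=0.7):
    """Build an OpenAI-style chat-completion payload for a FastChat server."""
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
    }

# Model name is an assumption; use whichever model your server is serving.
payload = build_chat_request(
    "vicuna-7b-v1.5",
    [{"role": "user", "content": "Hello!"}],
)
body = json.dumps(payload)  # ready to POST to API_URL with any HTTP client
```

The same payload shape works against the official OpenAI API, which is the point of the compatibility layer: existing client code can be repointed at a self-hosted server by changing only the base URL.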
OpenLLM
Fine-tune, serve, deploy, and monitor any open-source LLM in production. Used in production at BentoML for LLM-based applications.