
Inference Engines
OpenLLM
Fine-tune, serve, deploy, and monitor any open-source LLM in production. Used in production at BentoML for LLM-based applications.
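Since OpenLLM exposes served models behind an OpenAI-compatible API, a running server can be queried with the standard openai client. A minimal sketch, assuming a model has already been started (e.g. with `openllm serve`) and is listening locally; the port and model id below are assumptions, adjust them to your deployment:

```python
# Minimal sketch: query an OpenLLM server through its OpenAI-compatible API.
# Assumes a model is already running locally and listening on port 3000;
# the base_url and model id are assumptions, adjust to your setup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/v1", api_key="na")

response = client.chat.completions.create(
    model="llama3.2:1b",  # hypothetical model id; list served models via client.models.list()
    messages=[{"role": "user", "content": "Summarize what an inference engine does."}],
)
print(response.choices[0].message.content)
```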
TensorRT-LLM
NVIDIA's framework for LLM inference, optimized for NVIDIA GPUs.
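A minimal sketch using the high-level LLM API available in recent TensorRT-LLM releases; the model id and sampling settings here are assumptions, swap in any supported model:

```python
# Minimal sketch of offline inference with TensorRT-LLM's high-level LLM API
# (recent releases). The model id is an assumption; any supported
# Hugging Face model can be used, and the engine is built on first load.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
params = SamplingParams(max_tokens=64, temperature=0.8)

# generate() accepts a batch of prompts and returns one output per prompt.
for output in llm.generate(["What does an inference engine optimize?"], params):
    print(output.outputs[0].text)
```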