
Inference Engines
FasterTransformer
NVIDIA's framework for LLM inference (transitioned to TensorRT-LLM)
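Since FasterTransformer's functionality now lives in TensorRT-LLM, a minimal generation sketch using TensorRT-LLM's high-level LLM API might look like the following. This assumes a recent `tensorrt_llm` release with the LLM API installed and a supported NVIDIA GPU; the model name and sampling parameters are illustrative.

```python
from tensorrt_llm import LLM, SamplingParams

# Build/load an engine for a Hugging Face model (name is illustrative).
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

prompts = ["Hello, my name is", "The capital of France is"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Batched generation; each output carries the prompt and generated text.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(f"Prompt: {output.prompt!r} -> {output.outputs[0].text!r}")
```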
DeepSpeed-MII
MII is a DeepSpeed-powered library that delivers low-latency, high-throughput inference, similar to vLLM.
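A minimal sketch of MII's non-persistent pipeline API, assuming `deepspeed-mii` is installed (`pip install deepspeed-mii`); the model name and `max_new_tokens` value are illustrative.

```python
import mii

# Load a Hugging Face model into an in-process inference pipeline.
pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")

# Batched generation; returns one response object per prompt.
responses = pipe(["DeepSpeed is", "Seattle is"], max_new_tokens=64)
for response in responses:
    print(response.generated_text)
```

MII also supports a persistent deployment mode for serving the same model to multiple clients, which is where its throughput optimizations matter most.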