
Inference Engines
DeepSpeed-MII
MII enables low-latency and high-throughput inference, similar to vLLM, and is powered by DeepSpeed.
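As a quick illustration, here is a minimal sketch of local inference with MII's pipeline API, following the usage pattern shown in the DeepSpeed-MII README; the model name and prompts are only examples, and any supported Hugging Face text-generation model can be substituted:

```python
import mii

# Load a Hugging Face model into a local, non-persistent MII pipeline.
# The model name is an example; substitute any supported model.
pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")

# Generate completions for a batch of prompts with a fixed token budget.
responses = pipe(["DeepSpeed is", "Low-latency inference means"], max_new_tokens=64)
for response in responses:
    print(response.generated_text)
```

For long-lived serving rather than a one-off pipeline, MII also offers a persistent deployment mode (`mii.serve`) that keeps the model resident and accepts requests from multiple clients.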
Serge
A chat interface crafted with llama.cpp for running Alpaca models. No API keys, entirely self-hosted!
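Since the interface is built on llama.cpp, the same key-free, fully local setup can be sketched with the llama-cpp-python bindings (an assumption for illustration; the project itself ships as a dockerized web UI). The GGUF model path below is a placeholder for wherever the model file lives:

```python
from llama_cpp import Llama

# Load a local GGUF model; no API key or network access is required.
# The path is a placeholder, not a file the project provides.
llm = Llama(model_path="./models/alpaca-7b.Q4_K_M.gguf", n_ctx=2048)

# One chat turn through the OpenAI-style chat completion helper.
result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain self-hosting in one sentence."}]
)
print(result["choices"][0]["message"]["content"])
```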