
Inference Engines
Swiss Army Llama
Comprehensive set of tools for working with local LLMs for various tasks.
MII enables low-latency, high-throughput inference powered by DeepSpeed, similar to vLLM.