
Inference Engines
DeepSpeed-MII
MII enables low-latency, high-throughput inference, similar to vLLM; it is powered by DeepSpeed.
A playground for developers to fine-tune and deploy LLMs.