
Inference Engines
Serge
A chat interface built on llama.cpp for running Alpaca models. No API keys required, entirely self-hosted!
A high-throughput, low-latency inference and serving framework for LLMs and VLMs.