Easily build, version, evaluate and deploy your LLM-powered apps.
NanoFlow is a throughput-oriented, high-performance serving framework for LLMs. NanoFlow consistently delivers superior throughput compared to vLLM, DeepSpeed-FastGen, and TensorRT-LLM.
Building applications with LLMs through composability
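To make the composability idea concrete, here is a minimal sketch of a LangChain-style pipeline using LCEL's `|` operator. It assumes the `langchain-core` and `langchain-openai` packages and an `OPENAI_API_KEY` in the environment; the prompt text and model name are illustrative placeholders, not taken from the project itself.

```python
# Minimal sketch of composing a prompt, model, and parser into one chain.
# Assumes langchain-core and langchain-openai are installed and an
# OPENAI_API_KEY is set; prompt and model name are placeholders.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

print(chain.invoke({"text": "LLMs can be composed into pipelines."}))
```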
A chat interface crafted with llama.cpp for running Alpaca models. No API keys, entirely self-hosted!
A simple API for deploying any RAG pipeline or LLM you want, with support for adding plugins.
Get up and running with Llama 3, Mistral, Gemma, and other large language models.
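As a hedged illustration of how such a local runtime is typically driven, here is a minimal sketch using the official `ollama` Python client. It assumes a local Ollama server is already running and that the model has been pulled; the model name and prompt are placeholder assumptions.

```python
# Minimal sketch using the ollama Python client (pip install ollama).
# Assumes a local Ollama server is running and the "llama3" model has
# already been pulled (e.g. `ollama pull llama3`); the model name and
# prompt are illustrative placeholders.
import ollama

response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Explain LLM serving in one sentence."}],
)
print(response["message"]["content"])
```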
The first LLM multi-agent framework.