Blazingly fast LLM inference.
Playground for developers to fine-tune and deploy LLMs.
NanoFlow is a throughput-oriented, high-performance serving framework for LLMs. It consistently delivers higher throughput than vLLM, DeepSpeed-FastGen, and TensorRT-LLM.
Test your prompts. Evaluate and compare LLM outputs, catch regressions, and improve prompt quality.
Inference server for text embeddings, written in Rust (HFOIL license); see the client sketch below.
SGLang is a fast serving framework for large language models and vision language models; see the usage sketch below.
Formerly langchain-ChatGLM: a local knowledge-base QA application for LLMs such as ChatGLM, built with LangChain.
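For the Rust text-embeddings server above, here is a minimal client sketch in Python. It assumes a locally running instance; the host, port, and input text are placeholders, while the `/embed` endpoint and `inputs` payload follow the project's documented HTTP API.

```python
import requests

# Placeholder URL: adjust host/port to match how the server was started
# (e.g. the Docker default maps the API to localhost:8080).
TEI_URL = "http://localhost:8080/embed"

def embed(texts):
    """Request one embedding vector per input string."""
    response = requests.post(TEI_URL, json={"inputs": texts})
    response.raise_for_status()
    return response.json()  # list of float vectors, one per input

if __name__ == "__main__":
    vectors = embed(["What is deep learning?"])
    print(len(vectors), "embedding(s) of dimension", len(vectors[0]))
```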
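For the SGLang entry, a minimal sketch of its Python frontend language, assuming a server has already been launched separately. The `@sgl.function`/`sgl.gen` pattern follows SGLang's documented frontend; the model path, port, and function name are illustrative placeholders.

```python
import sglang as sgl

@sgl.function
def answer(s, question):
    # Build the prompt incrementally, then generate a bounded completion.
    s += "Question: " + question + "\n"
    s += "Answer: " + sgl.gen("answer", max_tokens=64)

# Point the frontend at a running SGLang server, e.g. one started with:
#   python -m sglang.launch_server --model-path meta-llama/Llama-3-8B-Instruct --port 30000
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

state = answer.run(question="What does a serving framework optimize for?")
print(state["answer"])
```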