A framework for creating ChatGPT-like bots over your own dataset.
An open-source GPU cluster manager for running LLMs
Get up and running with Llama 3, Mistral, Gemma, and other large language models.
A comprehensive set of tools for working with local LLMs across various tasks.
Test your prompts. Evaluate and compare LLM outputs, catch regressions, and improve prompt quality.
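The "catch regressions" workflow above can be sketched as a small harness: run each prompt through a model, assert expectations on the outputs, and flag any case that stops passing. This is a minimal illustrative sketch with a stubbed model, not any specific tool's actual API.

```python
# Minimal prompt-regression sketch: run test prompts through a (stubbed)
# model and check each output against an expectation. Hypothetical harness.

def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return "Paris" if "capital of France" in prompt else "unknown"

TEST_CASES = [
    {"prompt": "What is the capital of France?", "expect": "Paris"},
    {"prompt": "What is the capital of Atlantis?", "expect": "unknown"},
]

def evaluate(model, cases):
    """Return (prompt, passed) pairs — a simple regression report."""
    results = []
    for case in cases:
        output = model(case["prompt"])
        results.append((case["prompt"], case["expect"] in output))
    return results

report = evaluate(fake_model, TEST_CASES)
```

Real evaluation tools add model adapters, scoring rubrics, and side-by-side output comparison on top of this basic loop.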
NanoFlow is a throughput-oriented, high-performance serving framework for LLMs. NanoFlow consistently delivers superior throughput compared to vLLM, DeepSpeed-FastGen, and TensorRT-LLM.
A method designed to enhance the efficiency of Transformer models.