A playground for developers to fine-tune and deploy LLMs
An open-source GPU cluster manager for running LLMs
Build your own conversational search engine in under 500 lines of code, by LeptonAI.
Gateway streamlines requests to 100+ open- and closed-source models with a unified API. It is also production-ready, with support for caching, fallbacks, retries, timeouts, and load balancing, and can be edge-deployed for minimum latency.
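The fallback-and-retry behavior such a gateway provides can be sketched in a few lines. This is a minimal illustration of the pattern, not the Gateway's actual implementation; the provider callables and error type here are hypothetical stand-ins for real model backends.

```python
def call_with_fallback(providers, request, retries=2):
    """Try each provider in order, retrying transient failures,
    and return the first successful response."""
    last_error = None
    for provider in providers:
        for attempt in range(retries + 1):
            try:
                return provider(request)
            except RuntimeError as err:  # treat as a transient provider error
                last_error = err
    raise RuntimeError(f"all providers failed: {last_error}")

# Hypothetical providers: the first always fails, so requests fall back.
def flaky(req):
    raise RuntimeError("rate limited")

def stable(req):
    return f"echo: {req}"

print(call_with_fallback([flaky, stable], "hello"))  # echo: hello
```

A real gateway layers caching and per-provider timeouts on top of this loop, but the control flow is the same: exhaust retries on one backend before moving to the next.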
WebAssembly binding for llama.cpp - Enabling in-browser LLM inference
A toolkit for deploying and serving Large Language Models (LLMs).
Run LLMs and batch jobs on any cloud. Get maximum cost savings, highest GPU availability, and managed execution -- all with a simple interface.