Locally running web search using LLM chains.
A high-throughput and memory-efficient inference and serving engine for LLMs.
Framework to create ChatGPT-like bots over your dataset.
Create, deploy, and operate Actions using Python anywhere to enhance your AI agents and assistants. Batteries included, with an extensive set of libraries, helpers, and logging.
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
Easily build, version, evaluate, and deploy your LLM-powered apps.
Open Source LLM Engineering Platform 🪢 Tracing, Evaluations, Prompt Management, and Playground.