NVIDIA's framework for LLM inference.
Build your own conversational search engine in fewer than 500 lines of code, by LeptonAI.
A locally running web search tool using LLM chains.
Blazingly fast LLM inference.
A distributed multi-model LLM serving system with web UI and OpenAI-compatible RESTful APIs.
A Python package for text-to-SQL with self-hosting functionality and RESTful APIs compatible with both proprietary and open-source LLMs.
SGLang is a fast serving framework for large language models and vision language models.
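Several of the serving systems above advertise OpenAI-compatible RESTful APIs, which means any OpenAI-style client code works against them unchanged. A minimal sketch of what a chat-completion request to such a server looks like, assuming a hypothetical local endpoint at `http://localhost:8000/v1` and a placeholder model name (both are illustrative, not part of any specific project above):

```python
import json
import urllib.request

# Hypothetical local endpoint; any OpenAI-compatible server is addressed the same way.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(model: str, user_message: str) -> dict:
    """Build the JSON body of an OpenAI-style /chat/completions request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0.7,
    }


def send_chat_request(body: dict) -> dict:
    """POST the request body to the server (requires a running server)."""
    req = urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Construct and inspect the request body without needing a live server.
body = build_chat_request("my-local-model", "Hello!")
print(json.dumps(body, indent=2))
```

Because the request and response shapes match the OpenAI API, swapping between these servers is usually just a matter of changing the base URL.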