Nvidia Framework for LLM Inference
Run LLMs and batch jobs on any cloud. Get maximum cost savings, highest GPU availability, and managed execution -- all with a simple interface.
Building applications with LLMs through composability
NanoFlow is a throughput-oriented, high-performance serving framework for LLMs that consistently delivers higher throughput than vLLM, DeepSpeed-FastGen, and TensorRT-LLM.
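Throughput claims like this are usually checked with a client-side probe against a serving endpoint. Below is a minimal sketch that measures tokens per second against an OpenAI-compatible server, such as one started with vLLM (one of the baselines above); the URL, port, and model name are assumptions, and NanoFlow's own interface may differ.

```python
# Rough client-side throughput probe against an OpenAI-compatible
# serving endpoint, e.g. one started with `vllm serve <model>`.
# The base_url and model below are placeholders for whatever the
# server is actually running.
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

start = time.time()
resp = client.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # whatever model the server loaded
    prompt="Explain KV-cache reuse in one paragraph.",
    max_tokens=256,
)
elapsed = time.time() - start

tokens = resp.usage.completion_tokens
print(f"{tokens} tokens in {elapsed:.2f}s -> {tokens / elapsed:.1f} tok/s")
```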
FlexLLMGen is a high-throughput generation engine for running large language models with limited GPU memory. It achieves its throughput through IO-efficient offloading, compression, and large effective batch sizes.
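The offloading idea can be illustrated in a few lines of PyTorch: keep the weights in CPU RAM and stream one layer at a time onto the GPU, so that a large batch amortizes the transfer cost. This is a conceptual sketch only, not FlexLLMGen's actual API.

```python
# Conceptual sketch of weight offloading (not FlexLLMGen's real interface):
# weights live in CPU RAM and each layer is streamed to the GPU only while
# it is computing, trading PCIe transfers for a much larger batch size.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# A toy "model": a stack of large linear layers resident on the CPU.
layers = [nn.Linear(4096, 4096) for _ in range(8)]

def offloaded_forward(x: torch.Tensor) -> torch.Tensor:
    """Run the stack layer by layer, holding only one layer on the GPU."""
    x = x.to(device)
    for layer in layers:
        layer.to(device)          # stream weights in over PCIe
        with torch.no_grad():
            x = torch.relu(layer(x))
        layer.to("cpu")           # free GPU memory for the next layer
    return x

# A large effective batch amortizes the per-layer transfer cost.
out = offloaded_forward(torch.randn(256, 4096))
print(out.shape)
```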
Create, deploy, and operate Actions using Python anywhere to enhance your AI agents and assistants. Batteries included: an extensive set of libraries, helpers, and logging.
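The core pattern is a plain Python function exposed as a callable tool for an assistant. The sketch below illustrates that pattern with OpenAI-style tool calling; the function, model name, and schema are illustrative placeholders, and the project's own packaging and deployment layer is not shown here.

```python
# Illustrative sketch: a Python function exposed as an assistant-callable
# "action" via OpenAI-style tool calling. Names and schema are made up.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def get_repo_stars(repo: str) -> str:
    """Toy action; a real one would call the GitHub API."""
    return json.dumps({"repo": repo, "stars": 42})

tools = [{
    "type": "function",
    "function": {
        "name": "get_repo_stars",
        "description": "Look up the star count of a GitHub repository.",
        "parameters": {
            "type": "object",
            "properties": {"repo": {"type": "string"}},
            "required": ["repo"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model
    messages=[{"role": "user", "content": "How many stars does vllm-project/vllm have?"}],
    tools=tools,
)

# If the model chose to call the tool, run the Python action with its arguments.
call = resp.choices[0].message.tool_calls[0]
args = json.loads(call.function.arguments)
print(get_repo_stars(**args))
```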
An interactive chat project that uses Ollama, OpenAI, or MistralAI LLMs to quickly understand and navigate GitHub repositories or compressed file archives.
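The basic loop behind such a tool is simple: read source files and pass them to a chat model as context. A minimal sketch using the Ollama Python client follows; the file path and model name are placeholders, and the project's own interface is not shown.

```python
# Minimal sketch of the repo-chat idea using the Ollama Python client.
# The file path and model name below are placeholders.
from pathlib import Path
import ollama

source = Path("some_repo/main.py").read_text()  # hypothetical file

response = ollama.chat(
    model="llama3",  # any locally pulled Ollama model
    messages=[
        {"role": "system", "content": "You answer questions about the given source file."},
        {"role": "user", "content": f"Summarize this file:\n\n{source}"},
    ],
)
print(response["message"]["content"])
```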