Inference Engines
FasterTransformer
NVIDIA framework for LLM inference (transitioned to TensorRT-LLM)