Blazingly fast LLM inference.
Inference for text embeddings in Rust, HFOIL licence.
Use ChatGPT on WeChat via wechaty.
NVIDIA framework for LLM inference (transitioned to TensorRT-LLM).
A high-throughput and memory-efficient inference and serving engine for LLMs.
Lightweight alternative to LangChain for composing LLMs.
An on-device inference framework, including LLM inference on device (mobile phone/PC/IoT).