a benchmark platform for large language models (LLMs) that features anonymous, randomized battles in a crowdsourced manner.
A Challenging, Contamination-Free LLM Benchmark.
aims to track, rank, and evaluate LLMs and chatbots as they are released.
An Automatic Evaluator for Instruction-following Language Models using the Nous benchmark suite.
an evaluation benchmark focused on ancient Chinese language comprehension.
A benchmark designed to comprehensively assess honesty in LLMs.
Llama is an accessible, open large language model (LLM) designed for developers, researchers, and businesses to build, experiment, and responsibly scale their generative AI ideas. Part of a foundational system, it serves as a bedrock for innovation in the global community. A few key aspects: Open access: Easy accessibility to cutting-edge large language models, fostering […]
Qwen2.5 is a series of large language models developed by the Alibaba Cloud Intelligence team, designed to provide powerful natural language processing capabilities. Here are some key features and advantages of the product: Model Scale: The Qwen2.5 series includes multiple model scales, ranging from 0.5B to 72B parameters, catering to different scenarios and needs. Pre-training […]
DeepSeek-V3 is a powerful Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts the Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which have been thoroughly verified in DeepSeek-V2. Moreover, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets […]
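To make the "activated parameters per token" idea concrete, here is a minimal, generic top-k MoE routing sketch in PyTorch. The class name, layer sizes, and expert count are invented for illustration; this is not DeepSeek-V3's actual MLA/DeepSeekMoE implementation.

```python
# Generic top-k MoE layer: each token is routed to only `top_k` of the experts,
# so most expert parameters are never touched for that token. Illustrative only.
import torch
import torch.nn as nn


class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=1024, n_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                       # x: (tokens, d_model)
        scores = self.router(x)                 # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = torch.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run for each token; the rest stay idle,
        # which is why the "active" parameter count is far below the total.
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out
```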
(2025-01) DeepSeek-R1 by DeepSeek
(2024-12) Qwen2.5 by Alibaba
(2024-05) Llama3 by Meta
(2024-05) Mamba2 by CMU & Princeton
(2024-01) DeepSeek-v2 by DeepSeek
(2023-12) Mamba by CMU & Princeton
(2023-10) Mistral 7B by Mistral
(2023-07) LLaMA2 by Meta
(2023-05) Tree of Thoughts (ToT) by Google & Princeton
A high-throughput and memory-efficient inference and serving engine for LLMs; a minimal usage sketch follows this group of serving tools.
SGLang is a fast serving framework for large language models and vision language models.
a toolkit for deploying and serving Large Language Models (LLMs).
A high-throughput and low-latency inference and serving framework for LLMs and VLMs.
Get up and running with Llama 3, Mistral, Gemma, and other large language models.
NanoFlow is a throughput-oriented high-performance serving framework for LLMs. NanoFlow consistently delivers superior throughput compared to vLLM, DeepSpeed-FastGen, and TensorRT-LLM.
LLM inference in C/C++.
Open Source LLM Engineering Platform 🪢 Tracing, Evaluations, Prompt Management, and Playground.
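For the vLLM entry above, here is a minimal offline-inference sketch using vLLM's Python `LLM`/`SamplingParams` API; the model path and sampling values are placeholders, not recommendations.

```python
# Minimal vLLM offline-inference sketch: load a model once, then batch-generate.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")  # placeholder HF model path
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)

outputs = llm.generate(["Explain KV-cache paging in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```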
veRL is a flexible and efficient RL framework for LLMs.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective; a minimal initialization sketch follows this group of training frameworks.
DeepSpeed version of NVIDIA's Megatron-LM that adds additional support for several features such as MoE model training, Curriculum Learning, 3D Parallelism, and others.
A Native-PyTorch Library for LLM Fine-tuning.
A native PyTorch Library for large model training.
Generative AI framework built for researchers and PyTorch developers working on Large Language Models (LLMs), Multimodal Models (MMs), Automatic Speech Recognition (ASR), Text to Speech (TTS), and Computer Vision (CV) domains.
Ongoing research training transformer models at scale.
Making large AI models cheaper, faster, and more accessible.
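For the DeepSpeed entry above, here is a minimal sketch of wrapping a PyTorch model with the DeepSpeed engine. The config values are illustrative assumptions rather than tuned settings, and real jobs are normally launched with the `deepspeed` launcher rather than plain `python`.

```python
# Minimal DeepSpeed sketch: wrap a model, then route forward/backward/step
# through the returned engine. Config values are placeholders.
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)            # stand-in for a real model
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {"stage": 2},          # ZeRO stage-2 optimizer sharding
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# One illustrative training step.
x = torch.randn(4, 1024).to(engine.device)
loss = engine(x).pow(2).mean()
engine.backward(loss)
engine.step()
```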
A framework for few-shot evaluation of language models; a small usage sketch follows this group of evaluation tools.
A reliable click-and-go evaluation suite compatible with both open-source and proprietary models, supporting MixEval and other benchmarks.
a lightweight LLM evaluation suite that Hugging Face has been using internally.
a repository for evaluating open language models.
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
Eval tools by OpenAI.
Testing and evaluation library for LLM applications, in particular RAG pipelines.
a unified platform from the LangChain ecosystem for evaluation, human-in-the-loop (HITL) collaboration, and logging and monitoring of LLM applications.
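For the few-shot evaluation framework above (assuming it refers to EleutherAI's lm-evaluation-harness), here is a small sketch of driving an evaluation from Python via the `simple_evaluate` entry point; the model, task, and `limit` values are placeholders for a quick smoke test.

```python
# Minimal lm-evaluation-harness sketch: evaluate a small HF model on one task.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                     # Hugging Face backend
    model_args="pretrained=EleutherAI/pythia-160m", # placeholder model
    tasks=["hellaswag"],
    num_fewshot=0,
    limit=50,                                       # subsample for a quick check
)
print(results["results"]["hellaswag"])
```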