veRL - A flexible and efficient RL framework for LLMs.
DeepSpeed - A deep learning optimization library that makes distributed training and inference easy, efficient, and effective; a minimal usage sketch follows this list.
Megatron-DeepSpeed - DeepSpeed's version of NVIDIA's Megatron-LM, with added support for features such as MoE model training, curriculum learning, and 3D parallelism.
torchtune - A native PyTorch library for LLM fine-tuning.
torchtitan - A native PyTorch library for large-scale model training.
NeMo - A generative AI framework built for researchers and PyTorch developers working on large language models (LLMs), multimodal models (MMs), automatic speech recognition (ASR), text-to-speech (TTS), and computer vision (CV).
Megatron-LM - Ongoing research on training transformer models at scale.
Colossal-AI - Making large AI models cheaper, faster, and more accessible.
BMTrain - Efficient training for big models.
Mesh TensorFlow - Model parallelism made easier.
MaxText - A simple, performant, and scalable JAX LLM framework.
GPT-NeoX - An implementation of model-parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
TransformerEngine - A library for accelerating Transformer model training on NVIDIA GPUs.
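To make DeepSpeed's "easy, efficient" claim concrete, here is a minimal sketch of wrapping a toy PyTorch model with DeepSpeed's ZeRO stage-2 optimizer. The model, batch size, and learning rate are illustrative placeholders rather than recommendations, and the script assumes a CUDA machine launched via the `deepspeed` launcher.

```python
# Minimal DeepSpeed ZeRO sketch (illustrative only; launch with: deepspeed train.py)
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in for a real LLM

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},               # mixed-precision training
    "zero_optimization": {"stage": 2},       # shard optimizer states and gradients
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
}

# deepspeed.initialize returns (engine, optimizer, dataloader, lr_scheduler)
engine, _, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(4, 1024, device=engine.device, dtype=torch.half)
loss = engine(x).float().pow(2).mean()  # dummy loss for illustration
engine.backward(loss)  # engine handles loss scaling and gradient sharding
engine.step()          # optimizer step plus gradient zeroing in one call
```

Scaling out is then a launcher flag (e.g. `deepspeed --num_gpus=8 train.py`): ZeRO stage 2 partitions optimizer states and gradients across ranks, so their per-GPU memory footprint shrinks roughly with world size, which is the core of the efficiency pitch several of the libraries above share.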