Ongoing research on training transformer models at scale.
DeepSpeed version of NVIDIA's Megatron-LM that adds support for features such as MoE model training, Curriculum Learning, and 3D Parallelism.
A library for accelerating Transformer model training on NVIDIA GPUs.
An implementation of model-parallel autoregressive transformers on GPUs, built on the DeepSpeed library (see the tensor-parallelism sketch after this list).
Mesh TensorFlow: Model Parallelism Made Easier.
veRL is a flexible and efficient RL framework for LLMs.
A native-PyTorch library for LLM fine-tuning.
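
Several of the projects above (Megatron-LM and its DeepSpeed variant, Mesh TensorFlow, GPT-NeoX) center on tensor model parallelism: splitting a layer's weight matrix across devices so each device computes only a slice of the output. The sketch below illustrates that idea with a toy column-parallel linear layer. It is a minimal illustration under simplifying assumptions, not any of these libraries' actual APIs: the shards are simulated on a single CPU as a list of weight slices, and the final concatenation stands in for the all-gather a real distributed run would perform.

```python
import torch


class ColumnParallelLinear(torch.nn.Module):
    """Toy column-parallel linear layer: the output dimension is split
    evenly across `world_size` simulated ranks."""

    def __init__(self, in_features: int, out_features: int, world_size: int):
        super().__init__()
        assert out_features % world_size == 0, "output dim must shard evenly"
        shard = out_features // world_size
        # One weight shard per simulated rank; a real implementation would
        # place each shard on a different GPU.
        self.shards = torch.nn.ParameterList(
            [torch.nn.Parameter(0.02 * torch.randn(in_features, shard))
             for _ in range(world_size)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each rank computes its slice of the output; torch.cat stands in
        # for the all-gather a distributed run would perform.
        return torch.cat([x @ w for w in self.shards], dim=-1)


layer = ColumnParallelLinear(in_features=16, out_features=64, world_size=4)
y = layer(torch.randn(2, 16))
print(y.shape)  # torch.Size([2, 64])
```

In practice, Megatron-style implementations typically pair a column-parallel layer with a row-parallel one inside each transformer block, so that only a single collective communication is needed at the block boundary rather than one per layer.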