GPT-NeoX | LLMWay – The Way To LLM

GPT-NeoX

An implementation of model-parallel autoregressive transformers on GPUs, based on the DeepSpeed library.

Relevant Sites

torchtune 5,581

A Native-PyTorch Library for LLM Fine-tuning.

Megatron-DeepSpeed 2,188

DeepSpeed version of NVIDIA's Megatron-LM that adds support for several features, such as MoE model training, Curriculum Learning, 3D Parallelism, and others.

Megatron-LM 14,140

Ongoing research training transformer models at scale.

BMTrain 613

Efficient Training for Big Models.

DeepSpeed 40,634

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Colossal-AI 41,228

Making large AI models cheaper, faster, and more accessible.
