Mesh TensorFlow: Model Parallelism Made Easier.
DeepSpeed: making large AI models cheaper, faster, and more accessible (see the sketch after this list).
Megatron-DeepSpeed: the DeepSpeed fork of NVIDIA's Megatron-LM, adding support for features such as MoE model training, curriculum learning, and 3D parallelism.
GPT-NeoX: an implementation of model-parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
Megatron-LM: ongoing research on training transformer models at scale.
Transformer Engine: a library for accelerating Transformer model training on NVIDIA GPUs.
torchtune: a native-PyTorch library for LLM fine-tuning.
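Since several of these projects build on DeepSpeed, here is a minimal sketch of wrapping a PyTorch model in DeepSpeed's training engine. The model and the config values (batch size, optimizer, fp16, ZeRO stage) are illustrative assumptions, not recommendations, and the script is meant to be started with the `deepspeed` launcher, which sets up the distributed state.

```python
# Minimal sketch: wrapping a toy PyTorch model with DeepSpeed.
# Assumes `pip install deepspeed`, a CUDA GPU, and launch via
# `deepspeed train_sketch.py` (file name is hypothetical).
import torch
import torch.nn as nn
import deepspeed

model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024))

# Illustrative config (values are assumptions): fp16 training with
# ZeRO stage-2 partitioning of optimizer states and gradients.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
}

# deepspeed.initialize returns an engine that owns the forward/backward
# pass, loss scaling, and optimizer stepping.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

for step in range(10):
    x = torch.randn(8, 1024, device=engine.device, dtype=torch.half)
    loss = engine(x).float().pow(2).mean()  # toy objective
    engine.backward(loss)                   # engine-managed backward
    engine.step()                           # optimizer step + grad zeroing
```

Swapping `"stage": 2` for `"stage": 3` additionally partitions the parameters themselves across ranks, which is how these frameworks fit models that exceed a single GPU's memory.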