torchtitan: a native PyTorch library for large model training.
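To give a flavor of what native-PyTorch large-model training looks like, here is a minimal sketch using PyTorch's built-in FSDP (FullyShardedDataParallel), the kind of sharded data parallelism such a library builds on. The toy model, hyperparameters, and launch setup are illustrative assumptions, not torchtitan's actual training loop.

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Assumes launch via `torchrun --nproc_per_node=<n_gpus> script.py`,
# which sets the env vars that init_process_group reads.
dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

# Toy stand-in for a transformer stack (an assumption for illustration).
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.GELU(),
    torch.nn.Linear(4096, 1024),
).cuda()

# FSDP shards parameters, gradients, and optimizer state across ranks,
# gathering full parameters only around each forward/backward pass.
model = FSDP(model)

opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
x = torch.randn(8, 1024, device="cuda")
loss = model(x).pow(2).mean()  # dummy objective for illustration
loss.backward()
opt.step()
```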
Megatron-DeepSpeed: the DeepSpeed version of NVIDIA's Megatron-LM, adding support for features such as MoE model training, curriculum learning, and 3D parallelism.
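Megatron-DeepSpeed is driven by its own training scripts, but the DeepSpeed engine underneath follows a pattern like the sketch below. The toy model, batch size, and ZeRO stage are illustrative assumptions, not the library's actual setup.

```python
import torch
import deepspeed

# Toy stand-in for a transformer; typically launched with the
# `deepspeed` launcher, which sets up the distributed environment.
model = torch.nn.Linear(1024, 1024)

ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {"stage": 2},  # shard optimizer state + gradients
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

batch = torch.randn(4, 1024).to(engine.device)
loss = engine(batch).pow(2).mean()  # dummy loss for illustration
engine.backward(loss)  # engine handles gradient sync / ZeRO partitioning
engine.step()
```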
MaxText: a simple, performant, and scalable JAX LLM.
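For a taste of the SPMD style such JAX libraries build on, here is a minimal data-parallel training step using jax.pmap. The toy linear model is an assumption for illustration, not MaxText's actual code (which uses jit with sharding annotations).

```python
import jax
import jax.numpy as jnp

def loss_fn(w, x, y):
    return jnp.mean((x @ w - y) ** 2)

def step_fn(w, x, y):
    grads = jax.grad(loss_fn)(w, x, y)
    # Average gradients across devices so replicas stay in sync.
    grads = jax.lax.pmean(grads, axis_name="batch")
    return w - 1e-2 * grads

# Replicate the step across all local devices (data parallelism).
p_step = jax.pmap(step_fn, axis_name="batch")

n = jax.local_device_count()
w = jnp.zeros((16, 4))
ws = jax.device_put_replicated(w, jax.local_devices())  # same params everywhere
xs = jnp.ones((n, 8, 16))  # one batch shard per device
ys = jnp.ones((n, 8, 4))
ws = p_step(ws, xs, ys)
```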
Megatron-LM: ongoing research on training transformer models at scale.
torchtune: a native PyTorch library for LLM fine-tuning.
NeMo: a generative AI framework built for researchers and PyTorch developers working on large language models (LLMs), multimodal models (MMs), automatic speech recognition (ASR), text-to-speech (TTS), and computer vision (CV).
Mesh TensorFlow: Model Parallelism Made Easier.