Megatron-DeepSpeed

The DeepSpeed version of NVIDIA's Megatron-LM. It adds support for several features, including Mixture-of-Experts (MoE) model training, curriculum learning, 3D parallelism, and others.

