A simple, performant, and scalable JAX LLM!
Making large AI models cheaper, faster, and more accessible.
veRL is a flexible and efficient reinforcement learning (RL) framework for LLMs. A conceptual sketch of the kind of loop such frameworks orchestrate follows below.
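The sketch below is not veRL's API; it is a library-agnostic toy showing the generate, score, and update cycle that RL-for-LLM frameworks coordinate, with a one-step "generation" and a placeholder reward function standing in for a real language model and reward model.

```python
# Conceptual sketch only (not veRL's API): the generate -> score -> update
# loop behind RL fine-tuning of LLMs, shown with a toy policy.
import torch

vocab_size, hidden = 16, 32
policy = torch.nn.Sequential(torch.nn.Embedding(vocab_size, hidden),
                             torch.nn.Linear(hidden, vocab_size))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reward(token: torch.Tensor) -> torch.Tensor:
    # Placeholder reward: prefers higher token ids (stands in for a reward model).
    return token.float() / vocab_size

for step in range(100):
    prompt = torch.randint(vocab_size, (8,))           # batch of "prompts"
    logits = policy(prompt)                             # one-step "generation"
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()                              # sampled next token
    # REINFORCE-style policy-gradient update on the reward signal.
    loss = -(dist.log_prob(action) * reward(action)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```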
Efficient Training for Big Models.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
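As a minimal sketch of how DeepSpeed is typically used, the snippet below wraps an ordinary PyTorch model with `deepspeed.initialize`; the toy model, dummy loss, and config values are illustrative assumptions, not part of the description above, and the script is meant to be launched with the `deepspeed` launcher on a GPU machine.

```python
# Minimal sketch: wrapping an ordinary PyTorch model with DeepSpeed.
# Toy model, dummy data, and config values are illustrative assumptions.
# Run with the DeepSpeed launcher, e.g.: deepspeed train.py
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # placeholder model

ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # ZeRO stage-2 partitioning
}

# deepspeed.initialize returns an engine that manages data parallelism,
# mixed precision, and ZeRO optimizer-state sharding.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

for _ in range(10):
    x = torch.randn(8, 1024, device=engine.device, dtype=torch.half)
    loss = engine(x).float().pow(2).mean()  # dummy loss
    engine.backward(loss)   # DeepSpeed-managed backward pass
    engine.step()           # optimizer step + gradient clearing
```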
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
A DeepSpeed version of NVIDIA's Megatron-LM that adds support for additional features such as MoE model training, Curriculum Learning, and 3D Parallelism (illustrated below).
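The "3D Parallelism" mentioned above combines data, tensor (model), and pipeline parallelism. As a rough illustration, not Megatron-DeepSpeed's actual API, the sketch below shows how a fixed GPU count factors across the three axes; the function name and example numbers are assumptions chosen for clarity.

```python
# Illustrative sketch (not Megatron-DeepSpeed's API): factoring a GPU budget
# into the three axes of 3D parallelism.
def three_d_layout(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    """Return the data-parallel degree implied by the other two axes."""
    model_parallel = tensor_parallel * pipeline_parallel
    if world_size % model_parallel != 0:
        raise ValueError("world size must be divisible by tensor * pipeline degree")
    return world_size // model_parallel

# Example: 64 GPUs split into tensor-parallel groups of 8 (within a node)
# and 4 pipeline stages leave a data-parallel degree of 2:
#   64 = 8 (tensor) * 4 (pipeline) * 2 (data)
assert three_d_layout(64, tensor_parallel=8, pipeline_parallel=4) == 2
```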