Ongoing research on training transformer models at scale.
Making large AI models cheaper, faster, and more accessible.
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
A library for accelerating Transformer model training on NVIDIA GPUs.
A simple, performant, and scalable JAX LLM!
Generative AI framework built for researchers and PyTorch developers working on Large Language Models (LLMs), Multimodal Models (MMs), Automatic Speech Recognition (ASR), Text to Speech (TTS), and Computer Vision (CV) domains.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.