
Training Frameworks
GPT-NeoX
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
A simple, performant and scalable Jax LLM!