
Milestone Papers
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
(2021-01) Switch Transformers by Google
(2025-01) DeepSeek-R1 by DeepSeek